-
-
- Network Working Group David Cheriton
- Request for Comments: 1045 Stanford University
- February 1988
-
-
- VMTP: VERSATILE MESSAGE TRANSACTION PROTOCOL
- Protocol Specification
-
-
-
- STATUS OF THIS MEMO
-
- This RFC describes a protocol proposed as a standard for the Internet
- community. Comments are encouraged. Distribution of this document is
- unlimited.
-
-
- OVERVIEW
-
- This memo specifies the Versatile Message Transaction Protocol (VMTP)
- [Version 0.7 of 19-Feb-88], a transport protocol specifically designed
- to support the transaction model of communication, as exemplified by
- remote procedure call (RPC). The full function of VMTP, including
- support for security, real-time, asynchronous message exchanges,
- streaming, multicast and idempotency, provides a rich selection to the
- VMTP user level. Subsettability allows the VMTP module for particular
- clients and servers to be specialized and simplified to the services
- actually required. Examples of such simple clients and servers include
- PROM network bootload programs, network boot servers, data sensors and
- simple controllers, to mention but a few.
-
-
-
-
- RFC 1045 VMTP February 1988
-
-
- Table of Contents
-
- 1. Introduction 1
-
- 1.1. Motivation 2
- 1.1.1. Poor RPC Performance 2
- 1.1.2. Weak Naming 3
- 1.1.3. Function Poor 3
- 1.2. Relation to Other Protocols 4
- 1.3. Document Overview 5
-
- 2. Protocol Overview 6
-
- 2.1. Entities, Processes and Principals 7
- 2.2. Entity Domains 9
- 2.3. Message Transactions 10
- 2.4. Request and Response Messages 11
- 2.5. Reliability 12
- 2.5.1. Transaction Identifiers 13
- 2.5.2. Checksum 14
- 2.5.3. Request and Response Acknowledgment 14
- 2.5.4. Retransmissions 15
- 2.5.5. Timeouts 15
- 2.5.6. Rate Control 18
- 2.6. Security 19
- 2.7. Multicast 21
- 2.8. Real-time Communication 22
- 2.9. Forwarded Message Transactions 24
- 2.10. VMTP Management 25
- 2.11. Streamed Message Transactions 25
- 2.12. Fault-Tolerant Applications 28
- 2.13. Packet Groups 29
- 2.14. Runs of Packet Groups 31
- 2.15. Byte Order 32
- 2.16. Minimal VMTP Implementation 33
- 2.17. Message vs. Procedural Request Handling 33
- 2.18. Bibliography 34
-
- 3. VMTP Packet Formats 37
-
- 3.1. Entity Identifier Format 37
- 3.2. Packet Fields 38
-
-
-
-
-
-
-
- Cheriton [page i]
-
- 3.3. Request Packet 45
- 3.4. Response Packet 47
-
- 4. Client Protocol Operation 49
-
- 4.1. Client State Record Fields 49
- 4.2. Client Protocol States 51
- 4.3. State Transition Diagrams 51
- 4.4. User Interface 52
- 4.5. Event Processing 53
- 4.6. Client User-invoked Events 54
- 4.6.1. Send 54
- 4.6.2. GetResponse 56
- 4.7. Packet Arrival 56
- 4.7.1. Response 58
- 4.8. Management Operations 61
- 4.8.1. HandleNoCSR 62
- 4.9. Timeouts 64
-
- 5. Server Protocol Operation 66
-
- 5.1. Remote Client State Record Fields 66
- 5.2. Remote Client Protocol States 66
- 5.3. State Transition Diagrams 67
- 5.4. User Interface 69
- 5.5. Event Processing 70
- 5.6. Server User-invoked Events 71
- 5.6.1. Receive 71
- 5.6.2. Respond 72
- 5.6.3. Forward 73
- 5.6.4. Other Functions 74
- 5.7. Request Packet Arrival 74
- 5.8. Management Operations 78
- 5.8.1. HandleRequestNoCSR 79
- 5.9. Timeouts 82
-
- 6. Concluding Remarks 84
-
- I. Standard VMTP Response Codes 85
-
- II. VMTP RPC Presentation Protocol 87
-
- II.1. Request Code Management 87
-
- III. VMTP Management Procedures 89
-
- III.1. Entity Group Management 100
- III.2. VMTP Management Digital Signatures 101
-
- IV. VMTP Entity Identifier Domains 102
-
- IV.1. Domain 1 102
- IV.2. Domain 3 104
- IV.3. Other Domains 105
- IV.4. Decentralized Entity Identifier Allocation 105
-
- V. Authentication Domains 107
-
- V.1. Authentication Domain 1 107
- V.2. Other Authentication Domains 107
-
- VI. IP Implementation 108
-
- VII. Implementation Notes 109
-
- VII.1. Mapping Data Structures 109
- VII.2. Client Data Structures 111
- VII.3. Server Data Structures 111
- VII.4. Packet Group transmission 112
- VII.5. VMTP Management Module 113
- VII.6. Timeout Handling 114
- VII.7. Timeout Values 114
- VII.8. Packet Reception 115
- VII.9. Streaming 116
- VII.10. Implementation Experience 117
-
- VIII. UNIX 4.3 BSD Kernel Interface for VMTP 118
-
- Index 120
-
- List of Figures
-
- Figure 1-1: Relation to Other Protocols 4
- Figure 3-1: Request Packet Format 45
- Figure 3-2: Response Packet Format 47
- Figure 4-1: Client State Transitions 52
- Figure 5-1: Remote Client State Transitions 68
- Figure III-1: Authenticator Format 92
- Figure VII-1: Mapping Client Identifier to CSR 109
- Figure VII-2: Mapping Server Identifiers 110
- Figure VII-3: Mapping Group Identifiers 111
-
-
- 1. Introduction
-
- The Versatile Message Transaction Protocol (VMTP) is a transport
- protocol designed to support remote procedure call (RPC) and general
- transaction-oriented communication. By transaction-oriented
- communication, we mean that:
-
- - Communication is request-response: A client sends a request
- for a service to a server, the request is processed, and the
- server responds. For example, a client may ask for the next
- page of a file as the service. The transaction is terminated
- by the server responding with the next page.
-
- - A transaction is initiated as part of sending a request to a
- server and terminated by the server responding. There are no
- separate operations for setting up or terminating associations
- between clients and servers at the transport level.
-
- - The server is free to discard communication state about a
- client between transactions without causing incorrect behavior
- or failures.
-
- The term message transaction (or transaction) is used in the remainder of
- this document for a request-response exchange in the sense described
- above.
-
- VMTP handles the error detection, retransmission, duplicate suppression
- and, optionally, security required for transport-level end-to-end
- reliability.
-
- The protocol is designed to provide a range of behaviors within the
- transaction model, including:
-
- - Minimal two packet exchanges for short, simple transactions.
-
- - Streaming of multi-packet requests and responses for efficient
- data transfer.
-
- - Datagram and multicast communication as an extension of the
- transaction model.
-
- Example Uses:
-
- - Page-level file access - VMTP is intended as the transport
- level for file access, allowing simple, efficient operation on
- a local network. In particular, VMTP is appropriate for use
- by diskless workstations accessing shared network file
- servers.
-
- - Distributed programming - VMTP is intended to provide an
- efficient transport level protocol for remote procedure call
- implementations, distributed object-oriented systems and
- message-based systems that conform to the request-response
- model.
-
- - Multicast communication with groups of servers to: locate a
- specific object within the group, update a replicated object,
- synchronize the commitment of a distributed transaction, etc.
-
- - Distributed real-time control with prioritized message
- handling, including datagrams, multicast and asynchronous
- calls.
-
- The protocol is designed to operate on top of a simple unreliable
- datagram service, such as is provided by IP.
-
-
- 1.1. Motivation
-
- VMTP was designed to address three categories of deficiencies with
- existing transport protocols in the Internet architecture. We use TCP
- as the key current transport protocol for comparison.
-
-
- 1.1.1. Poor RPC Performance
-
- First, current protocols provide poor performance for remote procedure
- call (RPC) and network file access. This is attributable to three key
- causes:
-
- - TCP requires excessive packets for RPC, especially for
- isolated calls. In particular, connection setup and clear
- generate extra packets beyond those needed by VMTP to support
- RPC.
-
- - TCP is difficult to implement, speaking purely from the
- empirical experience over the last 10 years. VMTP was
- designed concurrently with its implementation, with focus on
- making it easy to implement and providing sensible subsets of
- its functionality.
-
- - TCP handles packet loss due to overruns poorly. We claim that
- overruns are the key source of packet loss in a
- high-performance RPC environment and, with the increasing
- performance of networks, will continue to be the key source.
- (Older machines and network interfaces cannot keep up with new
- machines and network interfaces. Also, low-end network
- interfaces for high-speed networks have limited receive
- buffering.)
-
- VMTP is designed for ease of implementation and efficient RPC. In
- addition, it provides selective retransmission with rate-based flow
- control, thus addressing all of the above issues.
-
-
- 1.1.2. Weak Naming
-
- Second, current protocols provide inadequate naming of transport-level
- endpoints because the names are based on IP addresses. For example, a
- TCP endpoint is named by an Internet address and port identifier.
- Unfortunately, this makes the endpoint tied to a particular host
- interface, not specifically the process-level state associated with the
- transport-level endpoint. In particular, this form of naming causes
- problems for process migration, mobile hosts and multi-homed hosts.
- VMTP provides host-address independent names, thereby solving the above
- mentioned problems.
-
- In addition, TCP provides no security and reliability guarantees on the
- dynamically allocated names. In particular, other than well-known
- ports, (host-addr, port-id)-tuples can change meaning on reboot
- following a crash. VMTP provides large identifiers with guarantee of
- stability, meaning that either the identifier never changes in meaning
- or else remains invalid for a significant time before becoming valid
- again.
-
-
- 1.1.3. Function Poor
-
- TCP does not support multicast, real-time datagrams or security. In
- fact, it only supports pair-wise, long-term, streamed reliable
- interchanges. Yet, multicast is of growing importance and is being
- developed for the Internet (see RFCs 966 and 988). Also, a datagram
- facility with the same naming, transmission and reception facilities as
- the normal transport level is a powerful asset for real-time and
- parallel applications. Finally, security is a basic requirement in an
- increasing number of environments. We note that security is natural to
- implement at the transport level to provide end-to-end security (as
- opposed to (inter)network level security). Without security at the
- transport level, a transport level protocol cannot guarantee the
- standard transport level service definition in the presence of an
- intruder. In particular, the intruder can interject packets or modify
- packets while updating the checksum, making a mockery of the
- transport-level claim of "reliable delivery".
-
- In contrast, VMTP provides multicast, real-time datagrams and security,
- addressing precisely these weaknesses.
-
- In general, VMTP is designed with the next generation of communication
- systems in mind. These communication systems are characterized as
- follows. RPC, page-level file access and other request-response
- behavior dominate. In addition, the communication substrate, both
- local and wide-area, provides high data rates, low error rates and
- relatively low delay. Finally, intelligent, high-performance network
- interfaces are common and in fact required to achieve performance that
- approximates the network capability. However, VMTP is also designed to
- function acceptably with existing networks and network interfaces.
-
-
- 1.2. Relation to Other Protocols
-
- VMTP is a transport protocol that fits into the layered Internet
- protocol environment. Figure 1-1 illustrates the place of VMTP in the
- protocol hierarchy.
-
-
-     +-----------+ +----+ +-----------------+ +------+
-     |File Access| |Time| |Program Execution| |Naming|... Application
-     +-----------+ +----+ +-----------------+ +------+    Layer
-           |           |           |             |      |
-           +-----------+-----------+-------------+------+
-                                   |
-                          +------------------+
-                          | RPC Presentation |            Presentation
-                          +------------------+            Layer
-                                   |
-                   +------+    +--------+
-                   | TCP  |    |  VMTP  |                 Transport
-                   +------+    +--------+                 Layer
-                      |            |
-             +-----------------------------------+
-             |     Internet Protocol & ICMP      |        Internetwork
-             +-----------------------------------+        Layer
-
- Figure 1-1: Relation to Other Protocols
-
- The RPC presentation level is not currently defined in the Internet
- suite of protocols. Appendix II defines a proposed RPC presentation
- level for use with VMTP and assumed for the definition of the VMTP
- management procedures. There is also a need for the definition of the
- Application layer protocols listed above.
-
- If internetwork services are not required, VMTP can be used without the
- IP layer, layered directly on top of the network or data link layers.
-
-
- 1.3. Document Overview
-
- The next chapter gives an overview of the protocol, covering naming,
- message structure, reliability, flow control, streaming, real-time,
- security, byte-ordering and management. Chapter 3 describes the VMTP
- packet formats. Chapter 4 describes the client VMTP protocol operation
- in terms of pseudo-code for event handling. Chapter 5 describes the
- server VMTP protocol operation in terms of pseudo-code for event
- handling. Chapter 6 summarizes the state of the protocol, some
- remaining issues and expected directions for the future. Appendix I
- lists some standard Response codes. Appendix II describes the RPC
- presentation protocol proposed for VMTP and used with the VMTP
- management procedures. Appendix III lists the VMTP management
- procedures. Appendix IV proposes initial approaches for handling entity
- identification for VMTP. Appendix V proposes initial authentication
- domains for VMTP. Appendix VI provides some details for implementing
- VMTP on top of IP. Appendix VII provides some suggestions on host
- implementation of VMTP, focusing on data structures and support
- functions. Appendix VIII describes a proposed program interface for
- UNIX 4.3 BSD and its descendants and related systems.
-
-
- 2. Protocol Overview
-
- VMTP provides an efficient, reliable, optionally secure transport
- service in the message transaction or request-response model with the
- following features:
-
- - Host address-independent naming with provision for multiple
- forms of names for endpoints as well as associated (security)
- principals. (See Sections 2.1, 2.2, 3.1 and Appendix IV.)
-
- - Multi-packet request and response messages, with a maximum
- size of 4 megaoctets per message. (Sections 2.3 and 2.14.)
-
- - Selective retransmission (Section 2.13) and rate-based flow
- control to reduce overruns and the cost of overruns. (Section
- 2.5.6.)
-
- - Secure message transactions with provision for a variety of
- encryption schemes. (Section 2.6.)
-
- - Multicast message transactions with multiple response messages
- per request message. (Section 2.7.)
-
- - Support for real-time communication with idempotent message
- transactions with minimal server overhead and state (Section
- 2.5.3), datagram request message transactions with no
- response, optional header-only checksum, priority processing
- of transactions, conditional delivery and preemptive handling
- of requests. (Section 2.8.)
-
- - Forwarded message transactions as an optimization for certain
- forms of nested remote procedure calls or message
- transactions. (Section 2.9.)
-
- - Multiple outstanding (asynchronous) message transactions per
- client. (Section 2.11.)
-
- - An integrated management module, defined with a remote
- procedure call interface on top of VMTP, providing a variety of
- communication services. (Section 2.10.)
-
- - Simple subset implementation for simple clients and simple
- servers. (Section 2.16.)
-
- This chapter provides an overview of the protocol as introduction to the
- basic ideas and as preparation for the subsequent chapters that describe
- the packet formats and event processing procedures in detail.
-
- In overview, VMTP provides transport communication between network-
- visible entities via message transactions. A message transaction
- consists of a request message sent by the client, or requestor, to a
- group of server entities followed by zero or more response messages to
- the client, at most one from each server entity. A message is
- structured as a message control portion and a segment data portion. A
- message is transmitted as one or more packet groups. A packet group is
- one or more packets (up to a maximum of 32 packets) grouped by the
- protocol for acknowledgment, sequencing, selective retransmission and
- rate control.
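-
- As a rough illustration of the structure described above, the sketch
- below counts the packets and packet groups needed to carry a message.
- The limit of 32 packets per group comes from the text; the amount of
- segment data carried per packet (BLOCK_OCTETS) and the function name
- are assumptions made only for this example, not values defined in this
- section.

```python
import math

MAX_PACKETS_PER_GROUP = 32   # from the text: at most 32 packets per packet group
BLOCK_OCTETS = 512           # ASSUMPTION: segment data carried by one packet


def packet_group_plan(segment_octets, block_octets=BLOCK_OCTETS):
    """Return (packets, groups) needed to carry segment_octets of segment data.

    A message without segment data still needs one packet for its
    message control portion, hence the minimum of one packet.
    """
    if segment_octets < 0:
        raise ValueError("segment size must be non-negative")
    packets = max(1, math.ceil(segment_octets / block_octets))
    groups = math.ceil(packets / MAX_PACKETS_PER_GROUP)
    return packets, groups
```

- Under these assumptions a 16 kilooctet segment fits in a single packet
- group, while one additional octet requires a second group.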
-
- Entities and VMTP operations are managed using a VMTP management
- mechanism that is accessed through a procedural interface (RPC)
- implemented on top of VMTP. In particular, information about a remote
- entity is obtained and maintained using the Probe VMTP management
- operation. Also, acknowledgment information and requests for
- retransmission are sent as notify requests to the management module.
- (In the following description, reference to an "acknowledgment" of a
- request or a response refers to a management-level notify operation that
- is acknowledging the request or response.)
-
-
- 2.1. Entities, Processes and Principals
-
- VMTP defines and uses three main types of identifiers: entity
- identifiers, process identifiers and principal identifiers, each 64 bits
- in length. Communication takes place between network-visible entities,
- typically mapping to, or representing, a message port or procedure
- invocation. Thus, entities are the VMTP communication endpoints. The
- process associated with each entity designates the agent behind the
- communication activity for purposes of resource allocation and
- management. For example, when a lock is requested on a file, the lock
- is associated with the process, not the requesting entity, allowing a
- process to use multiple entity identifiers to perform operations without
- lock conflict between these entities. The principal associated with an
- entity specifies the permissions, security and accounting designation
- associated with the entity. The process and principal identifiers are
- included in VMTP solely to make these values available to VMTP users
- with the security and efficiency provided by VMTP. Only the entity
- identifiers are actively used by the protocol.
-
- Entity identifiers are required to have three properties:
-
- Uniqueness Each entity identifier is uniquely defined at any given
- time. (An entity identifier may be reused over time.)
-
- Stability An entity identifier does not change between valid
- meanings without suitable provision for removing
- references to the entity identifier. Certain entity
- identifiers are strictly stable (i.e., never changing
- meaning); these are typically administratively assigned
- (although they need not be bound to a valid entity at
- all times) and are often called well-known identifiers.
- All other entity identifiers are required to be T-stable:
- they do not change meaning without having remained
- invalid for at least a time interval T.
-
- Host address independent
- An entity identifier is unique independent of the host
- address of its current host. Moreover, an entity
- identifier is not tied to a single Internet host
- address. An entity can migrate between hosts, reside on
- a mobile host that changes Internet addresses or reside
- on a multi-homed host. It is up to the VMTP
- implementation to determine and maintain up to date the
- host addresses of entities with which it is
- communicating.
-
- The stability of entity identifiers guarantees that an entity identifier
- represents the same logical communication entity and principal (in the
- security sense) over the time that it is valid. For example, if an
- entity identifier is authenticated as having the privileges of a given
- user account, it continues to have those privileges as long as it is
- continuously valid (unless some explicit notice is provided otherwise).
- Thus, a file server need not fully authenticate the entity on every file
- access request. With T-stable identifiers, periodically checking the
- validity of an entity identifier with period less than T seconds detects
- a change in entity identifier validity.
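-
- The following sketch shows one way a server might exploit T-stability
- to avoid full authentication on every request. It is an illustration
- only: the class name, the probe callable (standing in for a validity
- check of the identifier) and the choice of T/2 as the checking period
- are assumptions, not part of the protocol definition.

```python
import time


class AuthCache:
    """Cache authenticated entity identifiers, rechecking more often than T."""

    def __init__(self, t_seconds, probe, clock=time.monotonic):
        self.t_seconds = t_seconds
        self.period = t_seconds / 2.0   # ASSUMPTION: any period below T would do
        self.probe = probe              # hypothetical: entity_id -> bool (still valid?)
        self.clock = clock
        self.entries = {}               # entity_id -> (principal, time of last check)

    def authenticate(self, entity_id, principal):
        """Record the outcome of a full authentication."""
        self.entries[entity_id] = (principal, self.clock())

    def principal_for(self, entity_id):
        """Return the cached principal, or None if full authentication is needed."""
        entry = self.entries.get(entity_id)
        if entry is None:
            return None
        principal, checked_at = entry
        now = self.clock()
        elapsed = now - checked_at
        if elapsed >= self.t_seconds:
            # T or more has passed unchecked: the identifier could have been
            # invalid for T and then reassigned, so a probe proves nothing.
            del self.entries[entity_id]
            return None
        if elapsed >= self.period:
            if not self.probe(entity_id):
                del self.entries[entity_id]
                return None
            self.entries[entity_id] = (principal, now)
        return principal
```

- The cache trusts an entry only while less than T has elapsed since the
- last successful check, which is the condition under which a validity
- check is guaranteed to detect a change of meaning.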
-
- A group of entities can form an entity group, which is a set of zero or
- more entities identified by a single entity identifier. For example,
- one can have a single entity identifier that identifies the group of
- name servers. An entity identifier representing an entity group is
- drawn from the same name space as entity identifiers. However, single
- entity identifiers are flagged as such by a bit in the entity
- identifier, indicating that the identifier is known to identify at most
- one entity. In addition to the group bit, each entity identifier
- includes other standard type flags. One flag indicates whether the
- identifier is an alias for an entity in another domain (see Section 2.2
- below). Another flag indicates, for an entity group identifier,
- whether the identifier is a restricted group or not. A restricted group
- is one in which an entity can be added only by another entity with group
- management authorization. With an unrestricted group, an entity is
- allowed to add itself. If an entity identifier does not represent a
- group, a type bit indicates whether the entity uses big-endian or
- little-endian data representation (corresponding to Motorola 680X0 and
- VAX byte orders, respectively). Further specification of the format of
- entity identifiers is contained in Section 3.1 and Appendix IV.
-
- An entity identifier identifies a Client, a Server or a group of
- Servers <1>. A Client is always identified by a T-stable identifier. A
- server or group of servers may be identified by a T-stable identifier
- (group or single entity) or by a strictly stable (statically assigned)
- entity group identifier. The same T-stable identifier can be used to
- identify a Client and Server simultaneously as long as both are
- logically associated with the same entity. The state required for
- reliable, secure communication between entities is maintained in client
- state records (CSRs), which include the entity identifier of the Client,
- its principal, its current or next transaction identifier and so on.
-
-
- 2.2. Entity Domains
-
- An entity domain is an administration or an administration mechanism
- that guarantees the three required entity identifier properties of
- uniqueness, stability and host address independence for the entities it
- administers. That is, entity identifiers are only guaranteed to be
- unique and stable within one entity domain. For example, the set of all
- Internet hosts may function as one domain. Independently, the set of
- hosts local to one autonomous network may function as a separate domain.
- Each entity domain is identified by an entity domain identifier, Domain.
- Only entities within the same domain may communicate directly via VMTP.
- However, hosts and entities may participate in multiple entity domains
- simultaneously, possibly with different entity identifiers. For
- example, a file server may participate in multiple entity domains in
- order to provide file service to each domain. Each entity domain
- specifies the algorithms for allocation, interpretation and mapping of
- entity identifiers.
-
- Domains are necessary because it does not appear feasible to specify one
- universal VMTP entity identification administration that covers all
- entities for all time. Domains limit the number of entities that need
- to be managed to maintain the uniqueness and stability of the entity
- name space. Domains can also serve to separate entities of different
- security levels. For instance, allocation of an unclassified entity
- identifier cannot conflict with secret level entity identifiers because
- the former is interpreted only in the unclassified domain, which is
- disjoint from the secret domain.
-
- _______________
-
- <1> Terms such as Client, Server, Request, Response, etc. are
- capitalized in this document when they refer to their specific meaning
- in VMTP.
-
- It is intended that there be a small number of domains. In particular,
- there should be one (or a few) domains per installation "type", rather
- than per installation. For example, the Internet is expected to use one
- domain per security level, resulting in at most 8 different domains.
- Cluster-based internetwork architectures, those with a local cluster
- protocol distinct from the wide-area protocol, may use one domain for
- local use and one for wide-area use.
-
- Additional details on the specification of particular domains are provided
- in Appendix IV.
-
-
- 2.3. Message Transactions
-
- The message transaction is the unit of interaction between a Client that
- initiates the transaction and one or more Servers. A message
- transaction starts with a request message generated by a client. At
- the service interface, a server becomes involved with a transaction by
- receiving and accepting the request. A server terminates its
- involvement with a transaction by sending a response message. In a
- group message transaction, the server entity designated by the client
- corresponds to a group of entities. In this case, each server in the
- group receives a copy of the request. In the client's view, the
- transaction is terminated when it receives the response message or, in
- the case of a group message transaction, when it receives the last
- response message. Because it is normally impractical to determine when
- the last response message has been received, the current transaction is
- terminated by VMTP when the next transaction is initiated.
-
- Within an entity domain, a transaction is uniquely identified by the
- tuple (Client, Transaction, ForwardCount), where Transaction is a
- 32-bit number and ForwardCount is a 4-bit value. A Client uses
- monotonically increasing Transaction identifiers for new message
- transactions. Normally, the next higher transaction number, modulo
- 2**32, is used for the next message transaction, although there are
- cases in which it skips a small range of Transaction identifiers. (See
- the description of the STI control flag.) The ForwardCount is used when
- a message transaction is forwarded and is zero otherwise.
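-
- A minimal sketch of the identification scheme just described follows.
- The field widths (64-bit Client, 32-bit Transaction, 4-bit ForwardCount)
- and the modulo 2**32 increment are taken from the text; the type and
- function names, and the skip parameter used to model a skipped range of
- identifiers, are hypothetical.

```python
from dataclasses import dataclass

TRANSACTION_MODULUS = 2 ** 32   # Transaction is a 32-bit number
MAX_FORWARD_COUNT = 0xF         # ForwardCount is a 4-bit value


@dataclass(frozen=True)
class TransactionKey:
    """The (Client, Transaction, ForwardCount) tuple naming a transaction."""
    client: int              # 64-bit entity identifier of the Client
    transaction: int
    forward_count: int = 0   # zero unless the transaction was forwarded

    def __post_init__(self):
        if not 0 <= self.client < 2 ** 64:
            raise ValueError("client must fit in 64 bits")
        if not 0 <= self.transaction < TRANSACTION_MODULUS:
            raise ValueError("transaction must fit in 32 bits")
        if not 0 <= self.forward_count <= MAX_FORWARD_COUNT:
            raise ValueError("forward_count must fit in 4 bits")


def next_transaction(current, skip=0):
    """Return the next Transaction identifier, wrapping modulo 2**32.

    skip (hypothetical) is the number of identifiers to pass over.
    """
    return (current + 1 + skip) % TRANSACTION_MODULUS
```

- Because the key is immutable and hashable, it can index a table of
- per-transaction state directly.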
-
- A Client generates a stream of message transactions with increasing
- transaction identifiers, directed at a diversity of Servers. We say a
-
- Client has a transaction outstanding if it has invoked a message
- transaction, but has not received the last Response (or possibly any
- Response). Normally, a Client has only one transaction outstanding at a
- time. However, VMTP allows a Client to have multiple message
- transactions outstanding simultaneously, supporting streamed,
- asynchronous remote procedure call invocations. In addition, VMTP
- supports nested calls where, for example, procedure A calls procedure B
- which calls procedure C, each on a separate host with different client
- entity identifiers for each call but identified with the same process
- and principal.
-
-
- 2.4. Request and Response Messages
-
- A message transaction consists of a request message and one or more
- Response messages. A message is structured as a message control block
- (MCB) and segment data, passed as parameters, as suggested below.
-
- +-----------------------+
- | Message Control Block |
- +-----------------------+
- +-----------------------------------+
- | segment data |
- +-----------------------------------+
-
- In the request message, the MCB specifies control information about the
- request plus an optional data segment. The MCB has the following
- format:
-      0                   1                   2                   3
-      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
-     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
-     +                   ServerEntityId (8 octets)                   +
-     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
-     |     Flags     |                  RequestCode                  |
-     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
-     +                  CoresidentEntity (8 octets)                  +
-     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
-     >                     User Data (12 octets)                     <
-     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
-     |                          MsgDelivery                          |
-     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
-     |                          SegmentSize                          |
-     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
-
- The ServerEntityId is the entity to which the Request MCB is to be sent
- (or was sent, in the case of reception). The Flags indicate various
- options in the request and response handling as well as whether the
-
- CoresidentEntity, MsgDelivery and SegmentSize fields are in use. The
- RequestCode field specifies the type of Request. It is analogous to a
- packet type field of the Ethernet, acting as a switch for higher-level
- protocols. The CoresidentEntity field, if used, designates a subgroup
- of the ServerEntityId group to which the Request should be routed,
- namely those members that are co-resident with the specified entity (or
- entity group). The primary intended use is to specify the manager for a
- particular service that is co-resident with a particular entity, using
- the well-known entity group identifier for the service manager in the
- ServerEntityId field and the identifier for the entity in the
- CoresidentEntity field. The next 12 octets are user- or
- application-specified.
-
- The MsgDelivery field is optionally used by the RPC or user level to
- specify the portions of the segment data to transmit and on reception,
- the portions received. It provides the client and server with
- (optional) access to, and responsibility for, a simple selective
- transmission and reception facility. For example, a client may request
- retransmission of just those portions of the segment that it failed to
- receive as part of the original Response. The primary intended use is
- to support highly efficient multi-packet reading from a file server.
- Exploiting user-level selective retransmission using the MsgDelivery
- field, the file server VMTP module need not save multi-packet Responses
- for retransmission. Retransmissions, when needed, are instead handled
- directly from the file server buffers.
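-
- The selective retransmission idea above can be sketched with simple
- mask arithmetic. The encoding used here is an assumption made for
- illustration: MsgDelivery is treated as a 32-bit mask in which bit i
- stands for the i-th portion of the segment data. This section does not
- define the encoding, and the function names are hypothetical.

```python
MASK_BITS = 32   # ASSUMPTION: MsgDelivery viewed as a 32-bit mask, one bit per portion


def missing_portions(requested_mask, received_mask):
    """Return the mask of portions that were requested but not received."""
    return requested_mask & ~received_mask & ((1 << MASK_BITS) - 1)


def portion_indices(mask):
    """List the portion numbers whose bits are set, lowest bit first."""
    return [i for i in range(MASK_BITS) if mask & (1 << i)]
```

- A client that asked for four portions (mask 0b1111) and received all
- but the third (mask 0b1011) would place missing_portions(0b1111, 0b1011)
- in a follow-up Request, letting the file server resend that portion
- straight from its own buffers.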
-
- The SegmentSize field indicates the size of the data segment, if
- present. The CoresidentEntity, MsgDelivery and SegmentSize fields are
- usable as additional user data if they are not otherwise used.
-
- The Flags field provides a simple mechanism for the user level to
- communicate its use of VMTP options with the VMTP module as well as for
- VMTP modules to communicate this use among themselves. The use of these
- options is generally fixed for each remote procedure so that an RPC
- mechanism using VMTP can treat the Flags as an integral part of the
- RequestCode field for the purpose of demultiplexing to the correct stub.
-
- A Response message control block follows the same format except that
- the Response is sent from the Server to the Client and there is no
- CoresidentEntity field (and thus 20 octets of user data).
-
-
- 2.5. Reliability
-
- VMTP provides reliable, sequenced transfer of request and response
- messages as well as several variants, such as unreliable datagram
- requests. The reliability mechanisms include: transaction identifiers,
- checksums, positive acknowledgment of messages and timeout and
- retransmission of lost packets.
-
-
- 2.5.1. Transaction Identifiers
-
- Each message transaction is uniquely identified by the pair (Client,
- Transaction). (We defer discussion of the ForwardCount field to Section
- 2.9.) The 32-bit transaction identifier is initialized to a random
- value when the Client entity is created or allocated its entity
- identifier. The transaction identifier is incremented at the end of
- each message transaction. All Responses with the same specified
- (Client, Transaction) pair are associated with this Request.
-
- The transaction identifier is used for duplicate suppression at the
- Server. A Server maintains a state record for each Client for which it
- is processing a Request, identified by (Client, Transaction). A Request
- with the same (Client, Transaction) pair is discarded as a duplicate.
- (The ForwardCount field must also be equal.) Normally, this record is
- retained for some period after the Response is sent, allowing the Server
- to filter out subsequent duplicates of this Request. When a Request
- arrives and the Server does not have a state record for the sending
- Client, the Server takes one of three actions:
-
- 1. The Server may send a Probe request, a simple query
- operation, to the VMTP management module associated with the
- requesting Client to determine the Client's current
- Transaction identifier (and other information), initialize a
- new state record from this information, and then process the
- Request as above.
-
- 2. The Server may reason that the Request must be a new request
- because it does not have a state record for this Client if it
- keeps these state records for the maximum packet lifetime of
- packets in the network (plus the maximum VMTP retransmission
- time) and it has not been rebooted within this time period.
- That is, if the Request is not new either the Request would
- have exceeded the maximum packet lifetime or else the Server
- would have a state record for the Client.
-
- 3. The Server may know that the Request is idempotent or can be
- safely redone so it need not care whether the Request is a
- duplicate or not. For example, a request for the current
- time can be responded to with the current time without being
- concerned whether the Request is a duplicate. The Response
- is discarded at the Client if it is no longer of interest.
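-
- The duplicate suppression described above can be sketched in Python
- as follows. This is a minimal illustration, not the specified
- implementation: the class and function names are hypothetical, a
- missing state record is simply treated as a new Request (the second
- of the three actions), and delayed duplicates of older transactions
- are not handled.
-
```python
import random


class Server:
    """Keeps one state record per Client, keyed by the Client identifier."""

    def __init__(self):
        self.records = {}   # client id -> (transaction, forward_count)

    def accept(self, client, transaction, forward_count=0):
        """Return True for a new Request, False for a duplicate."""
        key = (transaction, forward_count)
        if self.records.get(client) == key:
            return False                 # same (Client, Transaction): discard
        self.records[client] = key       # remember the Request being handled
        return True


def new_transaction_id():
    """Random 32-bit starting value for a newly created Client entity."""
    return random.getrandbits(32)


def next_transaction_id(tid):
    """Advance at the end of a message transaction, wrapping at 32 bits."""
    return (tid + 1) & 0xFFFFFFFF
```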
-
-
- 2.5.2. Checksum
-
- Each VMTP packet contains a checksum to allow the receiver to detect
- corrupted packets independent of lower level checks. The checksum field
- is 32 bits, providing greater protection than the standard 16-bit IP
- checksum (in combination with an improved checksum algorithm). The
- large packets, high packet rates and general network characteristics
- expected in the future warrant a stronger checksum mechanism.
-
- The checksum normally covers both the VMTP header and the segment data.
- Optionally (for real-time applications), the checksum may apply only to
- the packet header, as indicated by the HCO control bit being set in the
- header. The checksum field is placed at the end of the packet to allow
- it to be calculated as part of a software copy or as part of a hardware
- transmission or reception packet processing pipeline, as expected in the
- next generation of network interfaces. Note that the number of header
- and data octets is an integral multiple of 8 because VMTP requires that
- the segment data be padded to be a multiple of 64 bits. The checksum
- field is appended after the padding, if any. The actual algorithm is
- described in Section 3.2.
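-
- The padding and coverage rules above can be sketched in Python as
- follows. The checksum algorithm itself (Section 3.2) is deliberately
- not reproduced here. The function names are hypothetical, and the
- use of zero octets as padding is an assumption of this example.
-
```python
def pad_segment(data: bytes) -> bytes:
    """Pad segment data with zero octets to a multiple of 8 octets (64 bits)."""
    return data + bytes(-len(data) % 8)


def checksummed_octets(header: bytes, segment: bytes, hco: bool) -> bytes:
    """Octets the checksum covers: the header only when HCO is set."""
    return header if hco else header + pad_segment(segment)
```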
-
- A zero checksum field indicates that no checksum was transmitted with
- the packet. VMTP may be used without a checksum only when there is a
- host-to-host error detection mechanism and the VMTP security facility is
- not being used. For example, one could rely on the Ethernet CRC if
- communication is restricted to hosts on the same Ethernet and the
- network interfaces are considered sufficiently reliable.
-
-
- 2.5.3. Request and Response Acknowledgment
-
- VMTP assumes an unreliable datagram network and internetwork interface.
- To guarantee delivery of Requests and Responses, VMTP uses positive
- acknowledgments, retransmissions and timeouts.
-
- A Request is normally acknowledged by receipt of a Response associated
- with the Request, i.e. with the same (Client, Transaction). With
- streamed message transactions, it may also be acknowledged by a
- subsequent Response that acknowledges previous Requests in addition to
- the transaction it explicitly identifies. A Response may be explicitly
- acknowledged by a NotifyVmtpServer operation requested of the manager
- for the Server. (In the case of streaming, this is a cumulative
- acknowledgment, acknowledging all Responses with a lower transaction
- identifier as well.) In addition, with non-streamed communication, a
- subsequent Request from the same Client acknowledges Responses to all
- previous message transactions (at least in the sense that either the
- client received a Response or is no longer interested in Responses to
- those earlier message transactions). Finally, a client response timeout
- (at the server) acknowledges a Response at least in the sense that the
- server need not be prepared to retransmit the Response subsequently.
- Note that there is no end-to-end guarantee of the Response being
- received by the client at the application level.
-
-
- 2.5.4. Retransmissions
-
- In general, a Request or Response is retransmitted periodically until
- acknowledged as above, up to some maximum number of retransmissions.
- VMTP uses parameters RequestRetries(Server) and ResponseRetries(Client)
- that indicate the number of retransmissions for the server and client
- respectively before giving up. We suggest the value 5 be used for both
- parameters based on our experience with VMTP and Internet packet loss.
- Smaller values (such as 3) could be used in low loss environments in
- which fast detection of failed hosts or communication channels is
- required. Larger values should be used in high loss environments where
- transport-level persistence is important.
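-
- A minimal Python sketch of the bounded retransmission described
- above. The function name and the two callbacks are hypothetical;
- only the suggested retry count of 5 comes from the text.
-
```python
REQUEST_RETRIES = 5   # value suggested in the text


def send_with_retries(transmit, wait_for_ack, retries=REQUEST_RETRIES):
    """Transmit once, then retransmit up to `retries` times.

    `transmit` sends the packet; `wait_for_ack` waits for one timeout
    period and returns True if an acknowledgment arrived.
    Returns the number of retransmissions that were needed.
    """
    for attempt in range(retries + 1):      # initial send plus retries
        transmit()
        if wait_for_ack():
            return attempt
    raise TimeoutError("no acknowledgment after %d retransmissions" % retries)
```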
-
- In a low loss environment, a retransmission only includes the MCB and
- not the segment data of the Request or Response, resulting in a single
- (short) packet on retransmission. The intended recipient of the
- retransmission can request selective retransmission of all or part of
- the segment data as necessary. The selective retransmission mechanism
- is described in Section 2.13.
-
- If a Response is specified as idempotent, the Response is neither
- retransmitted nor stored for retransmission. Instead, the Client must
- retransmit the Request to effectively get the Response retransmitted.
- The server VMTP module responds to retransmissions of the Request by
- passing the Request on to the server again to have it regenerate the
- Response (by redoing the operation), rather than saving a copy of the
- Response. Only Request packets for the last transaction from this
- client are passed on in this fashion; older Request packets from this
- client are discarded as delayed duplicates. If a Response is not
- idempotent, the VMTP module must ensure it has a copy of the Response
- for retransmission either by making a copy of the Response (either
- physically or copy-on-write) or by preventing the Server from continuing
- until the Response is acknowledged.
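-
- The following Python sketch illustrates the difference between
- idempotent and non-idempotent Responses at the server VMTP module.
- It is an illustration under stated assumptions: the class name is
- hypothetical, the handler is assumed to return the Response together
- with an "idempotent" indication, and transaction identifier
- wrap-around is ignored when detecting older Requests.
-
```python
class ResponseCache:
    """Server-side handling of (re)transmitted Requests from one Client."""

    def __init__(self, handler):
        self.handler = handler   # request -> (response, idempotent)
        self.last_tid = None     # transaction of the last Request handled
        self.saved = None        # copy kept only for non-idempotent Responses

    def on_request(self, tid, request):
        if self.last_tid is not None:
            if tid == self.last_tid and self.saved is not None:
                return self.saved        # resend the stored Response
            if tid < self.last_tid:
                return None              # delayed duplicate: discard
        # New Request, or retransmission of an idempotent one: (re)do it.
        response, idempotent = self.handler(request)
        self.last_tid = tid
        self.saved = None if idempotent else response
        return response
```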
-
-
- 2.5.5. Timeouts
-
- There is one client timer for each Client with an outstanding
- transaction. Similarly, there is one server timer for each Client
- transaction that is "active" at the server, i.e. there is a transaction
- record for a Request from the Client.
-
- When the client transmits a new Request (without streaming), the client
- timer is set to roughly the time expected for the Response to be
- returned. On timeout, the Request is retransmitted with the APG
- (Acknowledge Packet Group) bit set. The timeout is reset to the
- expected roundtrip time to the Server because an acknowledgment should
- be returned immediately unless a Response has been sent. The Request
- may also be retransmitted in response to receipt of a VMTP management
- operation indicating that selected portions of the Request message
- segment need to be retransmitted. With streaming, the timeout applies
- to the oldest outstanding message transaction in the run of outstanding
- message transactions. Without streaming, there is one message
- transaction in the run, reducing to the previous situation. After the
- first packet of a Response is received, the Client resets the timeout to
- be the time expected before the next packet in the Response packet group
- is received, assuming it is a multi-packet Response. If not, the timer
- is stopped. Finally, the client timer is used to time out while waiting
- for second and subsequent Responses to a multicast Request.
-
- The client timer is set at different times to four different values:
-
- TC1(Server) The expected time required to receive a Response from
- the Server. Set on initial Request transmission and
- again after the client's management module receives a
- NotifyVmtpClient operation acknowledging the Request.
-
- TC2(Server) The estimated round trip delay between the client and
- the server. Set when no Response has been received for
- TC1(Server) time and the Request is retransmitted with
- the APG bit set.
-
- TC3(Server) The estimated maximum expected interpacket time for
- multi-packet Responses from the Server. Set when
- waiting for subsequent Response packets within a packet
- group before timing out.
-
- TC4 The time to wait for additional Responses to a group
- Request after the first Response is received. This is
- specified by the user level.
-
- These values are selected as follows. TC1 can be set to TC2 plus a
- constant, reflecting the time within which most servers respond to most
- requests. For example, various measurements of VMTP usage at Stanford
- indicate that 90 percent of the servers respond in less than 200
- milliseconds. Setting TC1 to TC2 + 200 milliseconds means that most
- Requests receive a Response before timing out and also that overhead
- for retransmission
- for long running transactions is insignificant. A sophisticated
- implementation may make the estimation of TC1 further specific to the
- Server.
-
- TC2 may be estimated by measuring the time from when a Probe request is
- sent to the Server to when a response is received. TC2 can also be
- measured as the time between the transmission of a Request with the APG
- bit set to receipt of a management operation acknowledging receipt of
- the Request.
-
- When the Server is an entity group, TC1 and TC2 should be the largest of
- the values for the members of the group that are expected to respond.
- This information may be determined by probing the group on first use
- (and using the values for the last responses to arrive). Alternatively,
- one can resort to default values.
-
- TC3 is set initially to 10 times the transmission time for the maximum
- transmission unit (MTU) to be used for the Response. A sophisticated
- implementation may record TC3 per Server and refine the estimate based
- on measurements of actual interpacket gaps. However, a tighter estimate
- of TC3 only improves the reaction time when a packet is lost in a packet
- group, at some cost in unnecessary retransmissions when the estimate
- becomes overly tight.
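-
- The suggested client timer values can be written out as a short
- Python sketch. The function names, the use of milliseconds and the
- link-speed parameter are assumptions of this example; the constants
- (200 milliseconds, 10 MTU transmission times) are the ones suggested
- above.
-
```python
def tc1(tc2_ms, server_allowance_ms=200.0):
    """Expected time to receive a Response: TC2 plus a constant."""
    return tc2_ms + server_allowance_ms


def tc3(mtu_octets, bits_per_second):
    """Initial interpacket timeout: 10 MTU transmission times, in ms."""
    mtu_time_ms = mtu_octets * 8 * 1000.0 / bits_per_second
    return 10 * mtu_time_ms


def group_timeout(member_values_ms):
    """For an entity group, use the largest value among expected responders."""
    return max(member_values_ms)
```
-
- For instance, with an estimated round trip delay of 15 milliseconds,
- tc1(15.0) gives 215 milliseconds.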
-
- The server timer, one per active Client, takes on the following values:
-
- TS1(Client) The estimated maximum expected interpacket time. Set
- when waiting for subsequent Request packets within a
- packet group before timing out.
-
- TS2(Client) The time to wait to hear from a client before
- terminating the server processing of a Request. This
- limits the time spent processing orphan calls, as well
- as limiting how out of date the server's record of the
- Client state can be. In particular, TS2 should be
- significantly less than the minimum time within which it
- is reasonable to reuse a transaction identifier.
-
- TS3(Client) The estimated roundtrip time to the Client.
-
- TS4(Client) The time to wait after sending a Response (or last
- hearing from a client) before discarding the state
- associated with the Request which allows it to filter
- duplicate Request packets and regenerate the Response.
-
- TS5(Client) The time to wait for an acknowledgment after sending a
- Response before retransmitting the Response, or giving
- up (after some number of retransmissions).
-
- TS1 is set the same as TC3.
-
- The suggested value for TS2 is TC1 + 3*TC2 for this server, giving the
- Client time to time out waiting for a Response and retransmit 3 Request
- packets, asking for acknowledgments.
-
- TS3 is estimated the same as TC1 except that refinements to the estimate
- use measurements of the Response-to-acknowledgment times.
-
- In the general case, TS4 is set large enough so that a Client issuing a
- series of closely-spaced Requests to the same Server reuses the same
- state record at the Server end and thus does not incur the overhead of
- recreating this state. (The Server can recreate the state for a Client
- by performing a Probe on the Client to get the needed information.) It
- should also be set low enough so that the transaction identifier cannot
- wrap around and so that the Server does not run out of client state
- records (CSRs). We suggest a value of about 500 milliseconds.
- However, if the
- Server accepts non-idempotent Requests from this Client without doing a
- Probe on the Client, the TS4 value for this CSR is set to at least 4
- times the maximum packet lifetime.
-
- TS5 is TS3 plus the expected time for transmission and reception of the
- Response. We suggest that the latter be calculated as 3 times the
- transmission time for the Response data, allowing time for reception,
- processing and transmission of an acknowledgment at the Client end. A
- sophisticated implementation may refine this estimate further over time
- by timing acknowledgments to Responses.
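-
- The suggested server timer values can be sketched the same way in
- Python. The function and parameter names are hypothetical, and
- treating "at least 4 times the maximum packet lifetime" as the
- larger of that value and the 500 millisecond default is an
- assumption of this example.
-
```python
def ts2(tc1_ms, tc2_ms):
    """Time to wait to hear from a client: TC1 + 3*TC2."""
    return tc1_ms + 3 * tc2_ms


def ts4(non_idempotent_without_probe, max_packet_lifetime_ms,
        default_ms=500.0):
    """Time to retain the state record after sending a Response."""
    if non_idempotent_without_probe:
        return max(default_ms, 4 * max_packet_lifetime_ms)
    return default_ms


def ts5(ts3_ms, response_tx_time_ms):
    """Time to wait for an acknowledgment before retransmitting."""
    return ts3_ms + 3 * response_tx_time_ms
```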
-
-
- 2.5.6. Rate Control
-
- VMTP is designed to deal with the present and future problem of packet
- overruns. We expect overruns to be the major cause of dropped packets
- in the future. A client is expected to estimate and adjust the
- interpacket gap times so as to not overrun a server or intermediate
- nodes. The selective retransmission mechanism allows the server to
- indicate that it is being overrun (or some intermediate point is being
- overrun). For example, if the server requests retransmission of every
- Kth block, the client should assume overrun is taking place and increase
- the interpacket gap times. The client passes the server an indication
- of the interpacket gap desired for a response. The client may have to
- increase the interval because packets are being dropped by an
- intermediate gateway or bridge, even though it can handle a higher rate.
- A conservative policy is to increase the interpacket gap whenever a
- packet is lost as part of a multi-packet packet group.
-
- The provision of selective retransmission allows the rate of the client
- and the server to "push up" against the maximum rate (and thus lose
- packets) without significant penalty. That is, every time that packet
- transmission exceeds the rate of the channel or receiver, the recovery
- cost to retransmit the dropped packets is generally far less than
- retransmitting from the first dropped packet.
-
- The interpacket gap is expressed in 1/32nds of the MTU packet
- transmission time. The minimum interpacket gap is 0 and the maximum gap
- that can be described in the protocol is 8 packet times. This places a
- limit on the slowest receivers that can be efficiently used on a
- network, at least those handling multi-packet Requests and Responses.
- This scheme also limits the granularity of adjustment. However, the
- granularity is relative to the speed of the network, as opposed to an
- absolute time. For entities on different networks of significantly
- different speed, we assume the interconnecting gateways can buffer
- packets to compensate<2>. With different network speeds and intermediary
- nodes subject to packet loss, a node must adjust the interpacket gap
- based on packet loss. The interpacket gap parameter may be of limited
- use.
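-
- A small Python sketch of the interpacket gap encoding and of the
- conservative adjustment policy mentioned above. The 8-bit limit
- (255 units, just under 8 packet times), the step size and the
- function names are assumptions made for this example.
-
```python
GAP_UNITS_PER_PACKET = 32   # gap is expressed in 1/32nds of an MTU packet time
MAX_GAP_UNITS = 255         # assumed 8-bit field, just under 8 packet times


def gap_seconds(gap_units, mtu_octets, bits_per_second):
    """Convert an encoded gap to seconds for a given network."""
    mtu_time = mtu_octets * 8 / bits_per_second
    return gap_units * mtu_time / GAP_UNITS_PER_PACKET


def increase_gap(gap_units, step=4):
    """Conservative policy: widen the gap after a loss in a packet group."""
    return min(gap_units + step, MAX_GAP_UNITS)
```
-
- Because the unit is relative to the MTU transmission time, the same
- encoded value describes a shorter absolute gap on a faster network.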
-
-
- 2.6. Security
-
- VMTP provides an (optional) secure mode that protects against the usual
- security threats of peeking, impostoring, message tampering and replays.
- Secure VMTP must be used to guarantee any of the transport-level
- reliability properties unless it is guaranteed that there are no
- intruders or agents that can modify packets and update the packet
- checksums. That is, non-secure VMTP provides no guarantees in the
- presence of an intelligent intruder.
-
- The design closely follows that described by Birrell [1]. Authenticated
- information about a remote entity, including an encryption/decryption
- key, is obtained and maintained using a VMTP management operation, the
- authenticated Probe operation, which is executed as a non-secure VMTP
- message transaction. If a server receives a secure Request for which
- the server has no entity state, it sends a Probe request to the VMTP
- management module of the client, "challenging" it to provide an
- authenticator that both authenticates the client as being associated
- with a particular principal as well as providing a key for
- encryption/decryption. The principal can include a real and effective
- principal, as used in UNIX <3>. Namely, the real principal is the
- principal on whose behalf the Request is being performed whereas the
- effective principal is the principal of the module invoking the request
- or remote procedure call.
-
- _______________
-
- <2> Gateways must also employ techniques to preserve or intelligently
- modify (if appropriate) the interpacket gaps. In particular, they must
- be sure not to arbitrarily remove interpacket gaps as a result of their
- forwarding of packets.
-
- Peeking is prevented by encrypting every Request and Response packet
- with a working Key that is shared between Client and Server.
- Impostoring and replays are detected by comparing the Transaction
- identifier with that stored in the corresponding entity state record
- (which is created and updated by VMTP as needed). Message tampering is
- detected by encryption of the packet including the Checksum field. An
- intruder cannot update the checksum after modifying the packet without
- knowing the Key. The cost of fully encrypting a packet is close to the
- cost of generating a cryptographic checksum (and of course, encryption
- is needed in the general case), so there is no explicit provision for
- cryptographic checksum without packet encryption.
-
- A Client determines the Principal of the Server and acquires an
- authenticator for this Server and Principal using a higher level
- protocol. The Server cannot decrypt the authenticator or the Request
- packets unless it is in fact the Principal expected by the Client.
-
- An encrypted VMTP packet is flagged by the EPG bit in the VMTP packet
- header. Thus, encrypted packets are easily detected and demultiplexed
- from unencrypted packets. An encrypted VMTP packet is entirely
- encrypted except for the Client, Version, Domain, Length and Packet
- Flags fields at the beginning of the packet. Client identifiers can be
- assigned, changed and used so that they have no real meaning to an
- intruder or communicate only public information (such as the host
- Internet address). They are otherwise just a random means of
- identification and demultiplexing and therefore do not divulge any
- sensitive information. Further security measures must be taken at the
- network or data link levels
- if this information or traffic behavior is considered sensitive.
-
- VMTP provides multiple authentication domains as well as an encryption
- qualifier to accommodate different encryption algorithms and their
- corresponding security/performance trade-offs. (See Appendix V.) A
- separate key distribution and authentication protocol is required to
- handle generation and distribution of authenticators and keys. This
- protocol can be implemented on top of VMTP and can closely follow the
- Birrell design as well.
-
- _______________
-
- <3> Principal group membership must be obtained, if needed, by a
- higher level protocol.
-
- Security is optional in the sense that messages may be secure or
- non-secure, even between consecutive message transactions from the same
- client. It is also optional in that VMTP clients and servers are not
- required to implement secure VMTP (although they are required to respond
- intelligently to attempts to use secure VMTP). At worst, a Client may
- fail to communicate with a Server if the Server insists on secure
- communication and the Client does not implement security or vice versa.
- However, a failure to communicate in this case is necessary from a
- security standpoint.
-
-
- 2.7. Multicast
-
- The Server entity identifier in a message transaction can identify an
- entity group, in which case the Request is multicast to every Entity in
- this group (on a best-efforts basis). The Request is retransmitted
- until at least one Response is received (or an error timeout occurs)
- unless it is a datagram Request. The Client can receive multiple
- Responses to the Request.
-
- The VMTP service interface does not directly provide reliable multicast
- because it is expensive to provide, rarely needed by applications, and
- can be implemented by applications using the multiple Response feature.
- However, the protocol itself is adequate for reliable multicast using
- positive acknowledgments. In particular, a sophisticated Client
- implementation could maintain a list of members for each entity group of
- interest and retransmit the Request until acknowledged by all members.
- No modifications are required to the Server implementations.
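-
- The client-side approach to reliable multicast outlined above could
- look roughly like the following Python sketch. The function name,
- the callbacks and the retry limit are hypothetical; the sketch only
- shows the bookkeeping of which group members have yet to respond.
-
```python
def reliable_multicast(members, send_request, collect_responders,
                       max_tries=6):
    """Retransmit until every known member has responded, or give up.

    `send_request` multicasts the Request; `collect_responders` waits
    one timeout period and returns the members heard from so far.
    Returns the set of members that never responded (empty on success).
    """
    pending = set(members)
    for _ in range(max_tries):
        send_request()
        pending -= set(collect_responders())
        if not pending:
            return set()
    return pending
```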
-
- VMTP supports a simple form of subgroup addressing. If the CRE bit is
- set in a Request, the Request is delivered to the subgroup of entities
- in the Server group that are co-resident with one or more entities in
- the group (or individual entity) identified by the CoresidentEntity
- field of the Request. This is commonly used to send to the manager
- entity for a particular entity, where Server specifies the group of such
- managers. Co-resident means "using the same VMTP module", and logically
- on the same network host. In particular, a Probe request can be sent to
- the particular VMTP management module for an entity by specifying the
- VMTP management group as the Server and the entity in question as the
- CoresidentEntity.
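-
- Subgroup addressing can be pictured with the following Python
- sketch. The function name and the host_of mapping (standing in for
- "uses the same VMTP module") are assumptions of this example.
-
```python
def deliver_to(server_group, coresident_with, host_of, cre=True):
    """Members of server_group that should receive the Request.

    With CRE set, only members co-resident with at least one entity
    in `coresident_with` are selected; otherwise the whole group is.
    """
    if not cre:
        return set(server_group)
    hosts = {host_of[e] for e in coresident_with}
    return {e for e in server_group if host_of[e] in hosts}
```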
-
- As an experimental aspect of the protocol, VMTP supports the Server
- sending a group Response which is sent to the Client as well as members
- of the destination group of Servers to which the original Request was
- sent. The MDG bit indicates whether the Client is a member of this
- group, allowing the Server module to determine whether separately
- addressed packet groups are required to send the Response to both the
- Client and the Server group. Normally, a Server accepts a group
- Response only if it has received the Request and not yet responded to
- the Client. Also, the Server must explicitly indicate it wants to
- accept group Responses. Logically, this facility is analogous to
- responding to a mail message sent to a distribution list by sending a
- copy of the Response to the distribution list.
-
-
- 2.8. Real-time Communication
-
- VMTP provides three forms of support for real-time communication, in
- addition to its standard facilities, which make it applicable to a wide
- range of real-time applications. First, a priority is transmitted in
- each Request and Response which governs the priority of its handling.
- The priority levels are intended to correspond roughly to:
-
- - urgent/emergency
-
- - important
-
- - normal
-
- - background
-
- with additional gradations for each level. The interpretation and
- implementation of these priority levels is otherwise host-specific, e.g.
- the assignment to host processing priorities.
-
- Second, datagram Requests allow the Client to send a datagram to another
- entity or entity group using the VMTP naming, transmission and delivery
- mechanism, but without blocking, retransmissions or acknowledgment.
- (The client can still request acknowledgment using the APG bit although
- the Server does not expect missing portions of a multi-packet datagram
- Request to be retransmitted even if some are not received.) A datagram
- Request in non-streamed mode supersedes all previous Requests from the
- same Client. A datagram Request in stream mode is queued (if necessary)
- after previous datagram Requests on the same stream. (See Section
- 2.11.)
-
- Finally, VMTP provides several control bit flags to modify the handling
- of Requests and Responses for real-time requirements. First, the
- conditional message delivery (CMD) flag causes a Request to be discarded
- if the recipient is not waiting for it when it arrives; the same applies
- to a Response. This option allows a client to send a Request that is
- contingent on the server being able to process it immediately. Second,
- the header checksum only (HCO) flag indicates that the checksum has been
- calculated only on the VMTP header and not on the data segment.
- Applications such as voice and video can avoid the overhead of
- calculating the checksum on data whose utility is insensitive to typical
- bit errors without losing protection on the header information.
- Finally, the No Retransmission (NRT) flag indicates that the recipient
- of a message should not ask for retransmission if part of the message is
- missing but rather either use what was received or discard it.
-
- None of these facilities introduce new protocol states. In fact, the
- total processing overhead in the normal case is a bit flag test for CMD,
- HCO or NRT plus assignment of priority on packet transmission and
- reception. (In fact, CMD and NRT are not tested in the normal case.)
- The additional code complexity is minimal. We feel that the overhead
- for providing these real-time facilities is minimal and that these
- facilities are both important and adequate for a wide class of real-time
- applications.
-
- Several of the normal facilities of VMTP appear useful for real-time
- applications. First, multicast is useful for distributed, replicated
- (fault-tolerant) real-time applications, allowing efficient state query
- and update of, for example, sensors and control state. Second, the DGM
- or idempotent flag for Responses has some real-time benefits, namely: a
- Request is redone to get the latest values when the Response is lost,
- rather than just returning the old values. The desirability of this
- behavior is illustrated by considering a request for the current time of
- day. An idempotent handling of this request gives better accuracy in
- returning the current time in the case that a retransmission is
- necessary. Finally, the request-response semantics (in the absence of
- streaming) of each new Request from a Client terminating the previous
- message transactions from that Client, if any, provides the "most recent
- is most important" handling of processing that most real-time
- applications require.
-
- In general, a key design goal of VMTP was to provide an efficient
- general-purpose transport protocol with the features required for
- real-time communication. Further experience is required to determine
- whether this goal has been achieved.
-
-
- 2.9. Forwarded Message Transactions
-
- A Server may invoke another Server to handle a Request. It is fairly
- common for the invocation of the second Server to be the last action
- performed by the first Server as part of handling the Request. For
- example, the original Server may function primarily to select a process
- to handle the Request. Also, the Server may simply check the
- authorization on the Request. Describing this situation in the context
- of RPC, a nested remote procedure call may be the last action in the
- remote procedure and the return parameters are exactly those of the
- nested call. (This situation is analogous to tail recursion.)
-
- As an optimization to support this case, VMTP provides a Forward
- operation that allows the server to send the nested Request to the other
- server and have this other server respond directly to the Client.
-
- If the message transaction being forwarded is not multicast, is either
- not secure or involves two Servers that are the same principal, and the
- ForwardCount of the Request is less than the maximum forward count of
- 15, the Forward
- operation is implemented by the Server sending a Request onto the next
- Server with the forwarded Request identified by the same Client and
- Transaction as the original Request and a ForwardCount one greater than
- the Request received from the Client. In this case, the new Server
- responds directly to the Client. A forwarded Request is illustrated in
- the following figure.
-
- +---------+ Request +----------+
- | Client +---------------->| Server 1 |
- +---------+ +----------+
- ^ |
- | | forwarded Request
- | V
- | Response +----------+
- +----------------------| Server 2 |
- +----------+
-
- If the message transaction does not meet the above requirements, the
- Server's VMTP module issues a nested call and simply maps the returned
- Response to a Response to the original Request without further
- Server-level processing. In this case, the only optimization over a
- user-level
- nested call is one fewer VMTP service operation; the VMTP module handles
- the return to the invoking call directly. The Server may also use this
- form of forwarding when the Request is part of a stream of message
- transactions. Otherwise, it must wait until the forwarded message
- transaction completes before proceeding with the subsequent message
- transactions in the stream.
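-
- The choice between direct forwarding and a nested call can be
- sketched in Python as follows. This reflects one reading of the
- conditions stated above (not multicast; not secure, or both Servers
- the same principal; ForwardCount below 15). The function names are
- hypothetical.
-
```python
MAX_FORWARD_COUNT = 15


def can_forward_directly(multicast, secure, same_principal, forward_count):
    """True if the next Server may respond directly to the Client."""
    return (not multicast
            and (not secure or same_principal)
            and forward_count < MAX_FORWARD_COUNT)


def forwarded_request(client, transaction, forward_count):
    """Identity of the Request sent on to the next Server."""
    return (client, transaction, forward_count + 1)
```
-
- When can_forward_directly returns False, the VMTP module would fall
- back to the nested call described above.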
-
- Implementation of the user-level Forward operation is optional,
- depending on whether the server modules require this facility. Handling
- an incoming forwarded Request is a minor modification of handling a
- normal incoming Request. In particular, it is only necessary to examine
- the ForwardCount field when the Transaction of the Request matches that
- of the last message transaction received from the Client. Thus, the
- additional complexity in the VMTP module for the required forwarding
- support is minimal; the complexity is concentrated in providing a highly
- optimized user-level Forward primitive, and that is optional.
-
-
- 2.10. VMTP Management
-
- VMTP management includes operations for creating, deleting, modifying
- and querying VMTP entities and entity groups. VMTP management is
- logically implemented by a VMTP management server module that is invoked
- using a message transaction addressed to the Server, VMTP_MANAGER_GROUP,
- a well-known group entity identifier, in conjunction with the Coresident
- Entity mechanism introduced in Section 2.7. A particular Request may
- address the local module, the module managing a particular entity, the
- set of modules managing those entities contained in a specific group or
- all management modules, as appropriate.
-
- The VMTP management procedures are specified in Appendix III.
-
-
- 2.11. Streamed Message Transactions
-
- Streamed message transactions refer to two or more message transactions
- initiated by a Client before it receives the response to the first
- message transaction, with each transaction being processed and responded
- to in order but asynchronous relative to the initiation of the
- transactions. A Client streams message transactions, and thereby has
- multiple message transactions outstanding, by sending them as part of a
- single run of message transactions. A run of message transactions is a
- sequence of message transactions with the same Client and Server and
- consecutive Transaction identifiers, with all but the first and last
- Requests and Responses flagged with the NSR (Not Start Run) and NER
- (Not End Run) control bits. (Conversely, the first Request and
- Response do not have the NSR bit set and the last Request and Response
- do not have the NER bit set.) The message transactions in a run use
- consecutive transaction identifiers (except if the STI bit <4> is used
- in one, in which case the transaction identifier for the next message
- transaction is 256 greater, rather than 1).
-
- The Client retains a record for each outstanding transaction until it
- gets a Response or is timed out in error. The record provides the
- information required to retransmit the Request. On retransmission
- timeout, the Client retransmits the last Request for which it has not
- received a Response, as is done with non-streamed communication.
- (I.e. there need be only one timeout for all the outstanding message
- transactions associated with a single Client.)
-
- The consecutive transaction identifiers within a run of message
- transactions are used as sequence numbers for error control. The Server
- handles each message transaction in the sequence specified by its
- transaction identifier. When it receives a message transaction that is
- not marked as the beginning of a run, it checks that it previously
- received a message transaction with the predecessor transaction
- identifier, either 1 less than the current one or 256 less if the
- previous one had the STI bit set. If not, the Server sends a
- NotifyVmtpClient operation to the Client's manager indicating either:
- (1) the first message transaction was not fully received, or else (2) it
- has no record of the last one received. If the NRT control flag is set,
- the Server neither awaits nor expects retransmission but proceeds with
- handling this Request. This flag is used primarily when datagram
- Requests are used as part of a stream of message transactions. If NRT
- is not set, the Client must either retransmit from the first message
- transaction not fully received (either at all or in part) before the
- Server can proceed with handling this run of Requests, or else restart
- the run of message transactions.
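-
- The predecessor check described above might be sketched as follows.
- This is an illustration only: the function names and the returned
- action strings are hypothetical, and the 32-bit wraparound of
- transaction identifiers is an assumption of the sketch.
-
```python
from typing import Optional, Tuple

STI_SKIP = 256         # identifier step after a Request sent with STI set
TID_MODULUS = 1 << 32  # assumption: identifiers are 32 bits and wrap


def next_transaction_id(current: int, sti: bool) -> int:
    """Identifier expected for the message transaction after `current`."""
    step = STI_SKIP if sti else 1
    return (current + step) % TID_MODULUS


def classify_request(last: Optional[Tuple[int, bool]], new_id: int,
                     nsr: bool, nrt: bool) -> str:
    """Decide how the Server treats an incoming Request.

    `last` is (transaction identifier, STI flag) of the previous Request
    received from this Client, or None if the Server has no record.
    """
    if not nsr:
        return "process"  # marked as the start of a run: nothing to check
    if last is not None and new_id == next_transaction_id(*last):
        return "process"  # the predecessor was received
    # Predecessor missing: the Client's manager is notified in both cases.
    return "notify_and_process" if nrt else "notify_and_wait"
```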
-
- _______________
-
- <4> The STI bit is used by the Client to effectively allocate 255
- transaction identifiers for use by the Server in returning a large
- Response or stream of Responses.
-
- The Client expects to receive the Responses in a consecutive sequence,
- using the Transaction identifier to detect missing Responses. Thus, the
- Server must return Responses in sequence except possibly for some gaps,
- as follows. The Server can specify, in the PGcount field in a Response,
- the number of consecutive previous Responses that this Response
- corresponds to, up to a maximum of 255 previous Responses <5>. Thus,
- for example, a Response with Transaction identifier 46 and PGcount 3
- represents Responses 43, 44, 45 and 46. This facility allows the Server
- to eliminate sending Responses to Requests that require no Response,
- effectively batching the Responses into one. It also allows the Server
- to effectively maintain strictly consecutive sequencing when the Client
- has skipped 256 Transaction identifiers using the STI bit and the Server
- does not have that many Responses to return.
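-
- As a small worked example of the PGcount rule, the following sketch
- lists the Responses that a single Response stands for. The function
- name is hypothetical, and the sketch uses the simplified reading in
- which there is one Response per packet group.
-
```python
from typing import List

MAX_PGCOUNT = 255  # maximum number of previous Responses covered


def responses_covered(transaction: int, pgcount: int) -> List[int]:
    """Transaction identifiers represented by one Response."""
    if not 0 <= pgcount <= MAX_PGCOUNT:
        raise ValueError("PGcount must be between 0 and 255")
    # The Response itself plus the `pgcount` Responses before it.
    return list(range(transaction - pgcount, transaction + 1))
```
-
- With the values used in the text, responses_covered(46, 3) gives
- [43, 44, 45, 46].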
-
- If the Client receives a Response that is not consecutive, it
- retransmits the Request(s) for which the Response(s) is/are missing
- (unless, of course, the corresponding Requests were sent as datagrams).
- The Client should wait at the end of a run of message transactions for
- the last one to complete.
-
- When a Server receives a Request with the NSR bit clear and a higher
- transaction identifier than it currently has for the Client, it
- terminates all processing and discards Responses associated with the
- previous Requests. Thus, a stream of message transactions is
- effectively aborted by starting a new run, even if the Server was in the
- middle of handling the previous run.
-
- Using a mixture of datagram and normal Requests as part of a stream of
- message transactions, particularly with the use of the NRT bit, can lead
- to complex behavior under packet loss. It is recommended that a run of
- message transactions be all of one type to avoid problems, i.e. all
- normal or all datagrams. Finally, when a Server forwards a Request that
- is part of a run, it must suspend further processing of the subsequent
- Requests until the forwarded Request has been handled, to preserve order
- of processing. The simplest handling of this situation is to use a real
- nested call when forwarding with streamed message transactions.
-
- Flow control of streamed message transactions relies on rate control at
- the Client plus receipt (or non-receipt) of management notify operations
- indicating the presence of overrunning. A Client must reduce the number
- of outstanding message transactions at the Server when it receives a
- NotifyVmtpServer operation with the MSGTRANS_OVERFLOW ResponseCode. The
- transact parameter indicates the last packet group that was accepted.
-
-
- _______________
-
- <5> PGcount actually corresponds to packet groups which are described
- in Section 2.13. This (simplified) description is accurate when there
- is one Request or Response per packet group.
-
- The implementation of multiple outstanding message transactions requires
- the ability to record, timeout and buffer multiple outstanding message
- transactions at the Client end as well as the Server end. However, this
- facility is optional for both the Client and the Server. Client systems
- with heavy-weight processes and high network access cost are most likely
- to benefit from this facility. Servers that serve a wide variety of
- client machines should implement streaming to accommodate these types of
- clients.
-
-
- 2.12. Fault-Tolerant Applications
-
- One approach to fault-tolerant systems is to maintain a log of all
- messages sent at each node and replay the messages at a node when the
- node fails, after restarting it from the last checkpoint <6>. As an
- experimental facility, VMTP provides a Receive Sequence Number field in
- the NotifyVmtpClient and NotifyVmtpServer operations as well as the Next
- Receive Sequence (NRS) flag in the Response packet to allow a sender to
- log a receive sequence number with each message sent, allowing the
- packets to be replayed at a recovering node in the same sequence as they
- were originally received, thereby recovering to the same state as
- before.
-
- Basically, each sending node maintains a receive sequence number for
- each receiving node. On sending a Request to a node, it presumes that
- the receive sequence number is one greater than the one it has recorded
- for that node. If not, the receiving node sends a notify operation
- indicating the receive sequence number assigned to the Request. The NRS
- in the Response confirms that the Request message was the next receive
- sequence number, so the sender can detect if it failed to receive the
- notify operation in the previous case. With Responses, the packets are
- ordered by the Transaction identifier except for multicast message
- transactions, in which there may be multiple Responses with the same
- identification. In this case, NotifyVmtpServer operations are used to
- provide receive sequence numbers.
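-
- The sender-side bookkeeping for Requests might look roughly like the
- sketch below. It is only an illustration of the idea: the class, its
- methods and the starting count of 0 are assumptions, and the handling
- of multicast Responses is not modeled.
-
```python
class ReceiveSequenceLog:
    """Hypothetical sender-side log of receive sequence numbers."""

    def __init__(self):
        self.recorded = {}  # receiver -> last receive sequence number known
        self.entries = []   # [receiver, message, receive sequence number]

    def send(self, receiver, message) -> int:
        """Log a Request, presuming it gets the next number at the receiver."""
        presumed = self.recorded.get(receiver, 0) + 1
        self.recorded[receiver] = presumed
        self.entries.append([receiver, message, presumed])
        return len(self.entries) - 1  # index of the new log entry

    def on_notify(self, index: int, assigned: int) -> None:
        """The receiver reported the number it actually assigned."""
        entry = self.entries[index]
        entry[2] = assigned
        self.recorded[entry[0]] = assigned


def missed_notify(nrs: bool, notify_received: bool) -> bool:
    """True if the Response shows the presumption was wrong (NRS clear)
    but no notify operation carrying the real number has arrived."""
    return not nrs and not notify_received
```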
-
- This experimental extension of the protocol is focused on support for
- fault-tolerant real-time distributed systems required in various
- critical applications. It may be removed or extended, depending on
- further investigations.
-
- _______________
-
- <6> The sender-based logging is being investigated by Willy Zwaenepoel
- of Rice University.
-
-
- 2.13. Packet Groups
-
- A message (whether Request or Response) is sent as one or more packet
- groups. A packet group is one or more packets, each containing the same
- transaction identification and message control block. Each packet is
- formatted as below with the message control block logically embedded in
- the VMTP header.
-
- +------------------------------------++---------------------+
- | VMTP Header || |
- +------------+-----------------------|| segment data |
- |VMTP Control| Message Control Block || |
- +------------+-----------------------++---------------------+
-
- Some fields of the VMTP control portion of the packet and the segment
- data portion can differ between packets within the same packet group.
-
- The segment data portion of a packet group represents up to 16
- kilooctets of the segment specified in the message control block. The
- portion contained in each packet is indicated by the PacketDelivery
- field contained in the VMTP header. The PacketDelivery field as a bit
- mask has a similar interpretation to the MsgDelivery field in that each
- bit corresponds to a segment data block of 512 octets. The
- PacketDelivery field limits a packet group to 16 kilooctets and a
- maximum of 32 VMTP packets (with a minimum of 1 packet). Data can be
- sent in fewer packets by sending multiple data blocks per packet. We
- require that the underlying datagram service support delivery of (at
- minimum) the basic 580 octet VMTP packet <7>. To illustrate the use of
- the PacketDelivery field, consider for example the Ethernet, which has
- an MTU of 1536 octets, so one would send two 512-octet segment data
- blocks per packet. (In fact, if a third block is last in the segment,
- is less than 512 octets and fits in the packet without making it too
- big, an Ethernet packet could contain three data blocks.) Thus, an
- Ethernet packet group for a segment of size 0x1D00 octets (14.5 blocks)
- and MsgDelivery 0x000074FF consists of 6 packets, indicated as follows
- <8>.
-
- _______________
-
- <7> Note that with a 20 octet IP header, a VMTP packet is 600
- octets. We propose the convention that any host implementing VMTP
- implicitly agrees to accept IP/VMTP packets of at least 600 octets.
-
- <8> We use the C notation 0xHHHH to represent a hexadecimal number.
-
- Packet
- Delivery 1 1 1 1 1 1 1 1 0 0 1 0 1 1 1 0 0 0 0 0 0 . . .
- 0000 0400 0800 0C00 1000 1400 1800 1C00
- +----+----+----+----+----+----+----+-+
- Segment |....|....|....|....|....|....|....|.|
- +----+----+----+----+----+----+----+-+
- : : : : : : : / / :
- v v v v v v v /| v
- +----+----+----+----+ +----+ +---+
- Packets | 1 | 2 | 3 | 4 | | 5 | | 6 |
- +----+----+----+----+ +----+ +---+
-
- Each '.' is 256 octets of data. The PacketDelivery masks for the 6
- packets are: 0x00000003, 0x0000000C, 0x00000030, 0x000000C0, 0x00001400
- and 0x00006000, indicating the segment blocks contained in each of the
- packets. (Note that the delivery bits are in little endian order.)
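-
- The six masks above can be reproduced with a short sketch. The function
- names are hypothetical. Bit n of a mask (value 1 << n) is taken to
- stand for segment data block n, which matches the masks listed in the
- text, and each packet simply carries the next blocks_per_packet blocks
- that are present in MsgDelivery.
-
```python
from typing import List

BLOCK_SIZE = 512  # octets per segment data block


def blocks_in_mask(mask: int) -> List[int]:
    """Indices of the segment data blocks named by a 32-bit delivery mask."""
    return [bit for bit in range(32) if mask & (1 << bit)]


def packet_delivery_masks(msg_delivery: int,
                          blocks_per_packet: int) -> List[int]:
    """Split the blocks of one packet group into per-packet masks."""
    if blocks_per_packet < 1:
        raise ValueError("blocks_per_packet must be at least 1")
    blocks = blocks_in_mask(msg_delivery)
    masks = []
    for start in range(0, len(blocks), blocks_per_packet):
        mask = 0
        for bit in blocks[start:start + blocks_per_packet]:
            mask |= 1 << bit
        masks.append(mask)
    return masks
```
-
- For the Ethernet example, packet_delivery_masks(0x000074FF, 2) yields
- 0x00000003, 0x0000000C, 0x00000030, 0x000000C0, 0x00001400 and
- 0x00006000, and the union of these masks is again 0x000074FF.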
-
- A packet group is sent as a single "blast" of packets with no explicit
- flow control. However, the sender should estimate, and transmit at, a
- rate of packet transmission that avoids congesting the network or
- overwhelming the receiver, as described in Section 2.5.6. Packets in a
- packet group can be sent in any order with no change in semantics.
-
- When the first packet of a packet group is received (assuming the Server
- does not decide to discard the packet group), the Server saves a copy of
- the VMTP packet header, indicates it is currently receiving a packet
- group, initializes a "current delivery mask" (indicating the data in the
- segment received so far) to 0, accepts this packet (updating the current
- delivery mask) and sets the timer for the packet group. Subsequent
- packets in the packet group update the current delivery mask.
-
- Reception of a packet group is terminated when either the current
- delivery mask indicates that all the packets in the packet group have
- been received or the packet group reception timer (set to TC3 or TS1)
- expires. If the timer expires and the NRT bit is set in the Control
- flags, an incomplete packet group is discarded unless MDM is set. If
- MDM is set, the MsgDelivery field in the message control block is set to
- indicate the segment data blocks actually received, and the message
- control block and the segment data received are delivered to the
- application level.
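-
- A receiver's handling of the current delivery mask and of timer expiry
- might be sketched as follows. The class and the action strings are
- hypothetical, and expected_mask (the blocks the packet group should
- contain) is assumed to be derived from the header of the first packet.
- The case with NRT clear is reduced to a "notify" result here.
-
```python
from dataclasses import dataclass


@dataclass
class PacketGroupReception:
    """Hypothetical per-packet-group receive state."""
    expected_mask: int     # blocks the packet group should contain
    nrt: bool = False      # NRT control flag from the packet header
    mdm: bool = False      # MDM flag from the packet header
    current_mask: int = 0  # "current delivery mask", initially 0

    def accept(self, packet_delivery: int) -> bool:
        """Record one packet; return True once the group is complete."""
        self.current_mask |= packet_delivery
        return self.complete()

    def complete(self) -> bool:
        return (self.current_mask & self.expected_mask) == self.expected_mask

    def on_timeout(self) -> str:
        """Action to take when the reception timer expires."""
        if self.complete():
            return "deliver"
        if not self.nrt:
            return "notify"  # report current_mask; sender retransmits
        # NRT set: no retransmission is requested.
        return "deliver_partial" if self.mdm else "discard"
```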
-
- If NRT is not set and not all data blocks have been received, a
- NotifyVmtpClient (if a Request) or NotifyVmtpServer (if a Response) is
- sent back with a PacketDelivery field indicating the blocks received.
- The source of the packet group is then expected to retransmit the
- missing blocks. If not all blocks of a Request are received after
- RequestAckRetries(Client) retransmissions, the Request is discarded and
- a NotifyVmtpClient operation with an error response code is sent to the
- client's manager unless MDM is set. With a Response, there are
- ResponseAckRetries(Server) retransmissions and then, if MDM is not set,
- the requesting entity is returned the message control block with an
- indication of the amount of segment data received extending contiguously
- from the start of the segment. E.g. if the sender sent 6 512-octet
- blocks and only the first two and the last two arrived, the receiver
- would be told that 1024 octets were received. The ResponseCode field is
- set to BAD_REPLY_SEGMENT. (Note that VMTP is only able to indicate the
- specific segment blocks received if MDM is set.)
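-
- The "contiguous from the start of the segment" rule in the example
- above can be checked with a short sketch. The function name is
- hypothetical, and bit n of the mask is taken to stand for block n.
-
```python
BLOCK_SIZE = 512  # octets per segment data block


def contiguous_octets(received_mask: int) -> int:
    """Octets received contiguously from the start of the segment."""
    blocks = 0
    # Count set bits upward from bit 0 until the first missing block.
    while blocks < 32 and received_mask & (1 << blocks):
        blocks += 1
    return blocks * BLOCK_SIZE
```
-
- With six blocks sent and only the first two and last two received, the
- mask is 0b110011 and contiguous_octets(0b110011) is 1024, as stated in
- the text.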
-
- The parameters RequestAckRetries(Client) and ResponseAckRetries(Server)
- could be set on a per-client and per-server basis in a sophisticated
- implementation based on knowledge of packet loss.
-
- If the APG flag is set, a NotifyVmtpClient or NotifyVmtpServer
- operation is sent back at the end of the packet group reception,
- depending on whether it is a Request or a Response.
-
- At minimum, a Server should check that each packet in the packet group
- contains the same Client, Server, Transaction identifier and SegmentSize
- fields. It is a protocol error for any field other than the Checksum,
- packet group control flags, Length and PacketDelivery in the VMTP header
- to differ between any two packets in one packet group. A packet group
- containing a protocol error of this nature should be discarded.
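-
- The consistency rule might be checked along the following lines. The
- dictionary keys are hypothetical names for header fields; only the
- rule itself (every field except the four listed must match) is taken
- from the text.
-
```python
# Header fields that may legitimately differ within one packet group.
VARIABLE_FIELDS = {"checksum", "flags", "length", "packet_delivery"}


def same_packet_group(first: dict, other: dict) -> bool:
    """True if `other` is consistent with the first packet of the group."""
    keys = (first.keys() | other.keys()) - VARIABLE_FIELDS
    return all(first.get(key) == other.get(key) for key in keys)
```
-
- A receiver following the rule above would discard the packet group as
- soon as same_packet_group returns False for any packet.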
-
- Notify operations should be sent (or invoked) in the manager whenever
- there is a problem with a unicast packet, i.e. negative acknowledgments
- are always sent in this case. In the case of problems with multicast
- packets, the default is to send nothing in response to an error
- condition unless there is some clear reason why no other node can
- respond positively. For example, the packet might be a Probe for an
- entity that is known to have existed recently on the receiving host but
- is now invalid and could not have migrated. In this case, the
- receiving host responds to the Probe indicating the entity is
- nonexistent, knowing that no other host can respond to the Probe. For
- packets and packet groups that are received and processed without
- problems, a Notify operation is invoked only if the APG bit is set.
-
-
- 2.14. Runs of Packet Groups
-
- A run of packet groups is a sequence of packet groups, all Request
- packets or all Response packets, with the same Client and consecutive
- transaction identifiers, all but the first and last packets flagged with
- the NSR (Not Start Run) and NER (Not End Run) control bits. When each
- packet group in the run corresponds to a single Request or Response, it
- is identical to a run of message transactions (see Section 2.11).
- However, a Request message or a Response message may consist of up to
- 256 packet groups within a run, for a maximum of 4 megaoctets of segment
- data. A message that is continued in the next packet group in the run
- is flagged in the current packet group by the CMG flag. Otherwise, the
- next packet group in the run (if any) is treated as a separate Request
- or Response.
-
- Normally, each Request and Response message is sent as a single packet
- group and each run consists of a single packet group. In this case
- neither NSR nor NER is set. For multi-packet group messages, the
- PacketDelivery mask in the i-th packet group of a message corresponds to
- the portion of the segment offset by i-1 times 16 kilooctets,
- designating the first packet group to have i = 1.
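-
- The offset rule can be written out as a short sketch. The function
- name is hypothetical, and a kilooctet is taken to be 1024 octets,
- which agrees with 32 blocks of 512 octets per packet group.
-
```python
BLOCK_SIZE = 512               # octets per segment data block
PACKET_GROUP_SPAN = 16 * 1024  # octets of segment covered per packet group


def segment_offset(group_index: int, block_bit: int) -> int:
    """Octet offset in the segment of one block of the i-th packet group.

    `group_index` is i (the first packet group has i = 1) and `block_bit`
    is the bit position of the block in that group's PacketDelivery mask.
    """
    if group_index < 1 or not 0 <= block_bit < 32:
        raise ValueError("group index starts at 1; block bit is 0..31")
    return (group_index - 1) * PACKET_GROUP_SPAN + block_bit * BLOCK_SIZE
```
-
- The last block of the 256th packet group ends at
- segment_offset(256, 31) + 512 = 4194304 octets, which is the
- 4 megaoctet maximum given above.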
-
-
- 2.15. Byte Order
-
- For purposes of transmission and reception, the MCB is treated as
- consisting of 8 32-bit fields and the segment is a sequence of bytes.
- VMTP transmits the MCB in big-endian order, performing byte-swapping, if
- necessary, before transmission. A little-endian host must byte-swap the
- MCB on reception. (The data segment is transmitted as a sequence of
- bytes with no reordering.) The byte order of the sender of a message is
- indicated by the LEE bit in the entity identifier for the sender, the
- Client field if a Request and the Server field if a Response. The
- sender and receiver of a message are required to agree in some higher
- level protocol (such as an RPC presentation protocol) on who does
- further swapping of the MCB and data segment if required by the types of
- the data actually being transmitted. For example, the segment data may
- contain a record with 8-bit, 16-bit and 32-bit fields, so additional
- transformation is required to move the segment from a host of one byte
- order to another.
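-
- A sketch of the MCB byte-order handling, using Python's standard
- struct module, is given below. The function names are hypothetical,
- and the MCB is treated simply as a list of eight unsigned 32-bit
- values. The segment data is not touched, since it is sent as a plain
- sequence of bytes.
-
```python
import struct
from typing import Sequence, Tuple

MCB_FORMAT = ">8I"  # eight unsigned 32-bit fields, big-endian on the wire


def pack_mcb(fields: Sequence[int]) -> bytes:
    """Encode the eight MCB fields for transmission."""
    if len(fields) != 8:
        raise ValueError("the MCB is treated as 8 32-bit fields")
    return struct.pack(MCB_FORMAT, *fields)


def unpack_mcb(data: bytes) -> Tuple[int, ...]:
    """Decode a received 32-octet MCB into host-order integers."""
    return struct.unpack(MCB_FORMAT, data)
```
-
- The ">" prefix makes the struct module do any swapping needed, so the
- same code serves big-endian and little-endian hosts.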
-
- VMTP to date has used a higher-level presentation protocol in which
- segment data is sent in the native order of the sending host and
- byte-swapped as necessary by the receiving host. This approach
- minimizes the byte-swapping overhead between machines of common byte
- order (including when the communication is transparently local to one
- host), avoids a strong bias in the protocol to one byte-order, and
- allows for the sending entity to be sending to a group of hosts with
- different byte orders. (Note that the byte-swap overhead for the MCB is
- minimal.) The presentation-level overhead is minimal because most
- common operations, such as file access operations, have parameters that
- fit the MCB and data segment data types exactly.
-
-
- 2.16. Minimal VMTP Implementation
-
- A minimal VMTP client needs to be able to send a Request packet group
- and receive a Response packet group as well as accept and respond to
- Requests sent to its management module, including Probe and
- NotifyVmtpClient operations. It may also require the ability to invoke
- Probe and Notify operations to locate a Server and acknowledge
- responses. (The latter is required only if the client is involved in
- transactions that are neither idempotent nor datagram message
- transactions. A simple sensor, for example, can transmit VMTP datagram
- Requests indicating its current state with even less mechanism.) The
- minimal client thus requires very little code and is suitable as a basis
- for (e.g.) a network boot loader.
-
- A minimal VMTP server implements idempotent, non-encrypted message
- transactions, possibly with no segment data support. It should use an
- entity state record for each Request but need only retain it while
- processing the Request. Without segment data larger than a packet,
- there is no need for any timers, buffering (outside of immediate request
- processing) or queuing. In particular, it needs only as many records as
- message transactions it handles simultaneously (e.g. 1). The entity
- state record is required to recognize and respond to Request
- retransmissions during request processing.
-
- The minimal server need only receive Requests and be able to send
- Response packets. It need have only a minimal management module
- supporting Probe operations. (Support for the NotifyVmtpClient
- operation is only required if it does not respond immediately to a
- Request.) Thus the VMTP support for, say, a time server, sensor, or
- actuator can be extremely simple. Note that the server need never issue
- a Probe operation if it uses the host address of the Request for the
- Response and does not require the Client information returned by the
- Probe operation. The minimal server should also support reception of
- forwarded Requests.
-
-
- 2.17. Message vs. Procedural Request Handling
-
- A request-response protocol can be used to implement two forms of
- semantics on reception. With procedural handling of a Request, a
- Request is handled by a process associated with the Server that
- effectively takes on the identity of the calling process, treating the
- Request message as invoking a procedure, and relinquishing its
- association to the calling process on return. VMTP supports multiple
- nested calls spanning multiple machines. In this case, the distributed
- call stack that results is associated with a single process from the
- standpoint of authentication and resource management, using the
- ProcessId field supported by VMTP. The entity identifiers effectively
- link these call frames together. That is, the Client field in a Request
- is effectively the return link to the previous call frame.
-
- With message handling of a Request, a Request message is queued for a
- server process. The server process dequeues, reads, processes and
- responds to the Request message, executing as a separate process.
- Subsequent Requests to the same server are queued until the server asks
- to receive the next Request.
-
- Procedural semantics have the advantage of allowing each Request (up to
- the resource limits of the Server) to execute concurrently at the
- Server, with Request-specific synchronization. Message semantics have
- the advantage that Requests are serialized at the Server and that the
- request processing logically executes with the priority, protection and
- independent execution of a separate process. Note that procedural and
- message handling of a Request appear no different to the client
- invoking the message transaction, except possibly for differences in
- performance.
-
- We view the two Request handling approaches as appropriate under
- different circumstances. VMTP supports both models.
-
-
- 2.18. Bibliography
-
- The basic protocol is similar to that used in the original form of the V
- kernel [3, 4] as well as the transport protocol of Birrell and
- Nelson's [2] remote procedure call mechanism. An earlier version of the
- protocol was described in SIGCOMM'86 [6]. The rate-based flow control
- is similar to the techniques of Netblt [9]. The support for idempotency
- draws, in part, on the favorable experience with idempotency in the V
- distributed system. Its use was originally inspired by the Woodstock
- File Server [11]. The multicast support draws on the multicast
- facilities in V [5] and is designed to work with, and is now implemented
- using, the multicast extensions to the Internet [8] described in RFC 966
- and 988. The secure version of the protocol is similar to that
- described by Birrell [1] for secure RPC. The use of runs of packet
- groups is similar to Fletcher and Watson's delta-T protocol [10]. The
- use of "management" operations implemented using VMTP in place of
- specialized packet types is viewed as part of a general strategy of
- using recursion to simplify protocol architectures [7].
-
- Finally, this protocol was designed, in part, to respond to the
- requirements identified by Braden in RFC 955. We believe that VMTP
- satisfies the requirements stated in RFC 955.
-
- [1] A.D. Birrell, "Secure Communication using Remote Procedure
- Calls", ACM Trans. on Computer Systems 3(1), February, 1985.
-
-
- [2] A. Birrell and B. Nelson, "Implementing Remote Procedure Calls",
- ACM Trans. on Computer Systems 2(1), February, 1984.
-
-
- [3] D.R. Cheriton and W. Zwaenepoel, "The Distributed V Kernel and its
- Performance for Diskless Workstations", In Proceedings of the 9th
- Symposium on Operating System Principles, ACM, 1983.
-
-
- [4] D.R. Cheriton, "The V Kernel: A Software Base for Distributed
- Systems", IEEE Software 1(2), April, 1984.
-
-
- [5] D.R. Cheriton and W. Zwaenepoel, "Distributed Process Groups in
- the V Kernel", ACM Trans. on Computer Systems 3(2), May, 1985.
-
-
- [6] D.R. Cheriton, "VMTP: A Transport Protocol for the Next
- Generation of Communication Systems", In Proceedings of
- SIGCOMM'86, ACM, Aug 5-7, 1986.
-
-
- [7] D.R. Cheriton, "Exploiting Recursion to Simplify an RPC
- Communication Architecture", in preparation, 1988.
-
-
- [8] D.R. Cheriton and S.E. Deering, "Host Groups: A Multicast
- Extension for Datagram Internetworks", In 9th Data Communication
- Symposium, IEEE Computer Society and ACM SIGCOMM, September, 1985.
-
-
- [9] D.D. Clark and M. Lambert and L. Zhang, "NETBLT: A Bulk Data
- Transfer Protocol", Technical Report RFC 969, Defense Advanced
- Research Projects Agency, 1985.
-
-
- [10] J.G. Fletcher and R.W. Watson, "Mechanism for a Reliable Timer-
- based Protocol", Computer Networks 2:271-290, 1978.
-
-
- [11] D. Swinehart and G. McDaniel and D. Boggs, "WFS: A Simple File
- System for a Distributed Environment", In Proc. 7th Symp.
- Operating Systems Principles, 1979.
-
-
- 3. VMTP Packet Formats
-
- VMTP uses 2 basic packet formats corresponding to Request packets and
- Response packets. These packet formats are identical in most of the
- fields to simplify the implementation.
-
- We first describe the entity identifier format and the packet fields
- that are used in general; the individual packet formats are then
- described in detail in the following subsections. The reader and VMTP
- implementor may wish to refer to Chapters 4 and 5 for a description of
- VMTP event handling and only refer to this detailed description as
- needed.
-
-
- 3.1. Entity Identifier Format
-
- The 64-bit non-group entity identifiers have the following substructure.
-
-     0                   1                   2                   3
-     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
-    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
-    |R| |L|R|
-    |A|0|E|E|              Domain-specific structure
-    |E| |E|S|
-    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
-                        Domain-specific structure                   |
-    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
-
- The field meanings are as follows:
-
- RAE Remote Alias Entity - the entity identifier identifies
- an entity that is acting as an alias for some entity
- outside this entity domain. This bit is used by
- higher-level protocols. For instance, servers may take
- extra security and protection measures with aliases.
-
- GRP Group - 0, for non-group entity identifiers.
-
- LEE Little-Endian Entity - the entity transmits data in
- little-endian (VAX) order.
-
- RES Reserved - must be 0.
-
- The 64-bit entity group identifiers have the following substructure.
-
-     0                   1                   2                   3
-     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
-    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
-    |R| |U|R|
-    |A|1|G|E|              Domain-specific structure
-    |E| |P|S|
-    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
-                        Domain-specific structure                   |
-    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
-
- The field meanings are as follows:
-
- RAE Remote Alias Entity - same as for non-group entity
- identifier.
-
- GRP Group - 1, for entity group identifiers.
-
- UGP Unrestricted Group - no restrictions are placed on
- joining this group. I.e. any entity can join limited
- only by implementation resources.
-
- RES Reserved - must be 0.
-
- The all-zero entity identifier is reserved and guaranteed to be
- unallocated in all domains. In addition, a domain may reserve part of
- the entity identifier space for statically allocated identifiers.
- However, this is domain-specific.
-
- Description of currently defined entity identifier domains is provided
- in Appendix IV.
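-
- The four type bits might be decoded as in the sketch below. The
- function name is hypothetical, and the sketch assumes the usual
- convention for these diagrams, namely that bit 0 is the most
- significant bit of the 64-bit identifier.
-
```python
def entity_flags(entity_id: int) -> dict:
    """Decode the four type bits of a 64-bit entity identifier."""
    top = (entity_id >> 60) & 0xF  # diagram bits 0-3, bit 0 being the MSB
    group = bool(top & 0b0100)     # GRP: 1 for entity group identifiers
    flags = {
        "RAE": bool(top & 0b1000),  # Remote Alias Entity
        "GRP": group,
        "RES": bool(top & 0b0001),  # Reserved, must be 0
    }
    # Diagram bit 2 is UGP in a group identifier and LEE otherwise.
    flags["UGP" if group else "LEE"] = bool(top & 0b0010)
    return flags
```
-
- For example, an identifier whose top four bits are 0010 is a non-group
- identifier for a little-endian entity, while 0110 marks an
- unrestricted group.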
-
-
- 3.2. Packet Fields
-
- Client 64-bit identifier for the client entity associated with
- this packet. The structure, allocation and binding of
- this identifier is specific to the specified Domain. An
- entity identifier always includes 4 type bits as
- specified in Section 3.1.
-
- Version The 3-bit identifier specifying the version of the
- protocol. Current version is version 0.
-
- Domain The 13-bit identifier specifying the naming and
- administration domain for the client and server named in
- the packet.
-
- Packet Flags: 3 bits. (The normal case has none of the flags set.)
-
- HCO Header checksum only - checksum has only been calculated
- on the header. This is used in some real-time
- applications where the strict correctness of the data is
- not needed.
-
- EPG Encrypted packet group - part of a secure message
- transaction.
-
- MPG Multicast packet group - packet was multicast on
- transmission.
-
- Length A 13-bit field that specifies the number of 32-bit words
- in the segment data portion of the packet (if any),
- excluding the checksum field. (Every VMTP packet is
- required to be a multiple of 64 bits, possibly by
- padding out the segment data.) The minimum legal Length
- is 0, the maximum length is 4096 and it must be an even
- number.
-
- Control Flags: 9 bits. (The normal case has none of the flags set.)
-
- NRS Next Receive Sequence - the associated Request message
- (in a Response) or previous Response (if a Request) was
- received consecutive with the last Request from this
- entity. That is, no interfering messages were
- received.
-
- APG Acknowledge Packet Group - Acknowledge packet group on
- receipt. If a Request, send back a Request to the
- client's manager providing an update on the state of the
- transaction as soon as the request packet group is
- received, independent of the response being available.
- If a Response, send an update to the server's manager as
- soon as possible after response packet group is received
- providing an update on the state of the transaction at
- the client.
-
- NSR Not Start Run - 1 if this packet is not part of the
- first packet group of a run of packet groups.
-
- NER Not End Run - 1 if this packet is not part of the last
- packet group of a run of packet groups.
-
- NRT No Retransmission - do not ask for retransmissions of
- this packet group if not all received within timeout
- period, just deliver or discard.
-
- MDG Member of Destination Group - this packet is sent to a
- group and the client is a member of this group.
-
- CMG Continued Message - the message (Request or Response) is
- continued in the next packet group. The next packet
- group has to be part of the same run of packet groups.
-
- STI Skip Transaction Identifiers - the next transaction
- identifier that the Client plans to use is the current
- transaction plus 256, if part of the same run and at
- least this big if not. In a Request, this authorizes
- the Server to send back up to 256 packet groups
- containing the Response.
-
- DRT Delay Response Transmission - set by the Request sender if
- multiple responses are expected (as indicated by the MRD
- flag in the RequestCode) and the sender may be overrun by
- them. The responder(s) should then
- introduce a short random delay in sending the Response
- to minimize the danger of overrunning the Client. This
- is normally only used for multicast
- Requests where the Client may be receiving a large
- number of Responses. If DRT is not set, the Response is
- sent immediately.
-
- RetransmitCount:
- 3 bits - the ordinal number of transmissions of this
- packet group prior to this one, modulo 8. This field is
- used in estimation of roundtrip times. This count may
- wrap around during a message transaction. However, it
- should be sufficient to match acknowledgments and
- responses with a particular transmission.
-
- ForwardCount: 4 bits indicating the number of times this Request has
- been forwarded. The original Request is always sent
- with a ForwardCount of 0.
-
- Interpacket Gap: 8 bits.
- Indicates the recommended time to use between subsequent
- packet transmissions within a multi-packet packet group
- transmission. The Interpacket Gap time is in 1/32nd of
- a network packet transmission time for a packet of size
- MTU for the node. (Thus, the maximum gap time is 8
- packet times.)
-
- PGcount: 8 bits
- The number of packet groups that this packet group
- represents in addition to that specified by the
- Transaction field. This is used in acknowledging
- multiple packet groups in streamed communication.
-
- Priority 4-bit identifier for priority for the processing of this
- request both on transmission and reception. The
- interpretation is:
-
- 1100 urgent/emergency
-
- 1000 important
-
- 0000 normal
-
- 0100 background
-
- Viewing the higher-order bit as a sign bit (with 1
- meaning negative), low values are high priority and high
- values are low priority. The low-order 2 bits indicate
- additional (lower) gradations for each level.
-
- Function Code: 1 bit - types of VMTP packets. If the low-order bit of
- the function code is 0, the packet is sent to the
- Server, else it is sent to the Client.
-
- 0 Request
-
- 1 Response
-
- Transaction: 32 bits:
- Identifier for this message transaction.
-
- PacketDelivery: 32 bits:
- Delivery indicates the segment blocks contained in this
- packet. Each bit corresponds to one 512-octet block of
- segment data. A 1 bit in the i-th bit position
- (counting the LSB as 0) indicates the presence of the
- i-th segment block.
-
- Server: 64 bits
- Entity identifier for the server or server group
- associated with this transaction. This is the receiver
- when a Request packet and the sender when a Response
- packet.
-
- Code: 32 bits The Request Code and Response Code, set either at the
- user level or VMTP level depending on use and packet
- type. Both the Request and Response codes include 8
- high-order bits from the following set of control bits:
-
- CMD Conditional Message Delivery - only deliver the request
- or response if the receiving entity is waiting for it at
- the time of delivery, otherwise drop the message.
-
- DGM DataGram Message - indicates that the message is being
- sent as a datagram. If a Request message, do not wait
- for reply, or retransmit. If a Response message, treat
- this message transaction as idempotent.
-
- MDM Message Delivery Mask - indicates that the MsgDelivery
- field is being used. Otherwise, the MsgDelivery field
- is available for general use.
-
- SDA Segment Data Appended - segment data is appended to the
- message control block, with the total size of the
- segment specified by the SegmentSize field. Otherwise,
- the segment data is null and the SegmentSize field is
- not used by VMTP and available for user- or RPC-level
- uses.
-
- CRE CoResident Entity - indicates that the CoResidentEntity
- field in the message should be interpreted by VMTP.
- Otherwise, this field is available for additional user
- data.
-
- MRD Multiple Responses Desired - multiple Responses are
- desired for this Request if it is multicast.
- Otherwise, the VMTP module can discard subsequent
- Responses after the first Response.
-
- PIC Public Interface Code - Values for Code with this bit
- set are reserved for definition by the VMTP
- specification and other standard protocols defined on
- top of VMTP.
-
- RES Reserved for future use. Must be 0.
-
- CoResidentEntity
- 64-bit Identifier for an entity or group of entities
- with which the Server entity or entities must be
- co-resident, i.e. route only to entities (identified by
- Server) on the same host(s) as that specified by
- CoResidentEntity. Only meaningful if CRE is set in the
- Code field.
-
- User Data 12 octets Space in the header for the VMTP user to
- specify user-specific control and data.
-
- MsgDelivery: 32 bits
- The segment blocks being transmitted (in total) in this
- packet group following the conventions for the
- PacketDelivery field. This field is ignored by the
- protocol and treated as an additional user data field if
- MDM is 0. On transmission, the user level sets the
- MsgDelivery to indicate those portions of the segment to
- be transmitted. On receipt, the MsgDelivery field is
- modified by the VMTP module to indicate the segment data
- blocks that were actually received before the message
- control block is passed to the user or RPC level. In
- particular, the kernel does not discard the packet group
- if segment data blocks are missing. A Server or Client
- entity receiving a message with a MsgDelivery in use
- must check the field to ensure adequate delivery and
- retry the operation if necessary.
-
- SegmentSize: 32 bits
- Size of segment in octets, up to a maximum of 16
- kilooctets without streaming and 4 megaoctets with
- streaming, if SDA is set. Otherwise, this field is
- ignored by the protocol and treated as an additional
- user data field.
-
- Segment Data: 0-16 kilooctets
- 0 octets if SDA is 0, else the portion of the segment
- corresponding to the Delivery Mask, limited by the
- SegmentSize and the MTU, padded out to a multiple of 64
- bits.
-
- Checksum: 32 bits.
- The 32-bit checksum for the header and segment data.
-
-
- The VMTP checksum algorithm <9> develops a 32-bit checksum by computing
- two 16-bit, ones-complement sums (like IP), each covering different
- parts of the packet. The packet is divided into clusters of 16 16-bit
- words. The first, third, fifth,... clusters are added to the first sum,
- and the second, fourth, sixth,... clusters are added to the second sum.
- Addition stops at the end of the packet; there is no need to pad out to
- a cluster boundary (although it is necessary that the packet be an
- integral multiple of 64 bits; padding octets may have any value and are
- included in the checksum and in the transmitted packet). If either of
- the resulting sums is zero, it is changed to 0xFFFF. The two sums are
- appended to the transmitted packet, with the first sum being transmitted
- first. Four bytes of zero in place of the checksum may be used to
- indicate that no checksum was computed.
-
- _______________
-
- <9> This algorithm and description are largely due to Steve Deering of
- Stanford University.
-
- The 16-bit, ones-complement addition in this algorithm is the same as
- used in IP and, therefore, subject to the same optimizations. In
- particular, the words may be added up 32-bits at a time as long as the
- carry-out of each addition is added to the sum on the following
- addition, using an "add-with-carry" type of instruction. (64-bit or
- 128-bit additions would also work on machines that have registers that
- big.)
-
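The two-sum scheme described above can be sketched in Python as follows. This is an illustrative sketch, not a reference implementation. It takes the text literally and appends the sums themselves rather than their complements. It also assumes that 16-bit words are read in big-endian (network) order. The names `add_ones_complement`, `vmtp_checksum` and `CLUSTER_WORDS` are hypothetical.

```python
import struct

CLUSTER_WORDS = 16  # one cluster is sixteen 16-bit words (32 octets)


def add_ones_complement(a: int, b: int) -> int:
    """Add two 16-bit values, folding the carry back in (end-around carry)."""
    total = a + b
    return (total & 0xFFFF) + (total >> 16)


def vmtp_checksum(packet: bytes) -> bytes:
    """Return the 4 checksum octets for a packet (header plus padded data)."""
    if len(packet) % 8 != 0:
        raise ValueError("packet must be a multiple of 64 bits")
    # Assumption: 16-bit words are taken in big-endian (network) order.
    words = struct.unpack(f">{len(packet) // 2}H", packet)
    sums = [0, 0]
    for index, word in enumerate(words):
        # Clusters 1, 3, 5, ... feed sums[0]; clusters 2, 4, 6, ... feed sums[1].
        which = (index // CLUSTER_WORDS) % 2
        sums[which] = add_ones_complement(sums[which], word)
    # A zero sum is changed to 0xFFFF, which keeps four zero octets free to
    # mean "no checksum was computed".
    first, second = (s if s != 0 else 0xFFFF for s in sums)
    return struct.pack(">HH", first, second)
```

The per-word loop favors clarity over speed. The wider additions mentioned above would replace it in a performance-sensitive implementation.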
- A particular weakness of this algorithm (shared by IP) is that it does
- not detect the erroneous swapping of 16-bit words, which may easily
- occur due to software errors. A future version of VMTP is expected to
- include a more secure algorithm, but such an algorithm appears to
- require hardware support for efficient execution.
-
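The swapping weakness follows from ones-complement addition being order-independent. The short sketch below illustrates it with arbitrary example values; `ones_sum16` is a hypothetical helper name. Two word sequences that differ only in the order of their first two words produce the same 16-bit sum.

```python
def ones_sum16(words):
    """16-bit ones-complement sum of a sequence of 16-bit words."""
    total = 0
    for word in words:
        total += word
        total = (total & 0xFFFF) + (total >> 16)  # end-around carry
    return total


original = [0x1234, 0xABCD, 0x0F0F, 0xFFFF]
swapped = [0xABCD, 0x1234, 0x0F0F, 0xFFFF]  # first two words exchanged

# The data differs, but the sum (and so the checksum) does not.
same_sum = ones_sum16(original) == ones_sum16(swapped)
```

In the VMTP scheme this applies to any two words that contribute to the same one of the two sums.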
- Not all of these fields are used in every packet. The specific packet
- formats are described below. If a field is not mentioned in the
- description of a packet type, its use is assumed to be clear from the
- above description.
-
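The delivery-mask and Length conventions above can be illustrated with a few helpers. This is a sketch under the stated field definitions (512-octet segment blocks, bit i counted from the LSB, Length in 32-bit words padded to a multiple of 64 bits). All function names are hypothetical.

```python
BLOCK_OCTETS = 512   # one segment block
MAX_BLOCKS = 32      # a 32-bit mask covers at most 16 kilooctets


def full_delivery_mask(segment_size: int) -> int:
    """Mask with one bit set for every block of a segment of this size."""
    if not 0 <= segment_size <= MAX_BLOCKS * BLOCK_OCTETS:
        raise ValueError("segment does not fit in one 32-bit delivery mask")
    blocks = -(-segment_size // BLOCK_OCTETS)  # ceiling division
    return (1 << blocks) - 1


def blocks_present(mask: int) -> list:
    """Indices of the segment blocks flagged in a delivery mask."""
    return [i for i in range(MAX_BLOCKS) if mask >> i & 1]


def missing_blocks(wanted: int, received: int) -> list:
    """Blocks that were requested but are not marked as received."""
    return blocks_present(wanted & ~received)


def length_field(data_octets: int) -> int:
    """Length value (32-bit words) after padding data to 64 bits."""
    padded = (data_octets + 7) // 8 * 8
    return padded // 4
```

For example, a receiver could compare the MsgDelivery value set by the sender with the one reported on receipt and retry only if `missing_blocks` returns a non-empty list.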
-
- 3.3. Request Packet
-
- The Request packet (or packet group) is sent from the client to the
- server or group of servers to solicit processing plus the return of zero
- or more responses. A Request packet is identified by a 0 in the LSB of
- the fourth 32-bit word in the packet.
-
-  0                   1                   2                   3
-  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- +                       Client (8 octets)                       +
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |Ver  |                         |H|E|M|                         |
- |sion |          Domain         |C|P|P|         Length          |
- |     |                         |O|G|G|                         |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |N|A|N|N|N|M|C|S|D|Retra|Forward|    Inter-     |       |R|R|R| |
- |R|P|S|E|R|D|M|T|R|nsmit| Count |    Packet     | Prior |E|E|E|0|
- |S|G|R|R|T|G|G|I|T|Count|       |     Gap       | -ity  |S|S|S| |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |                          Transaction                          |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |                        PacketDelivery                         |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- +                       Server (8 octets)                       +
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |C|D|M|S|R|C|M|P|                                               |
- |M|G|D|D|E|R|R|I|                  RequestCode                  |
- |D|M|M|A|S|E|D|C|                                               |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- +                  CoResidentEntity (8 octets)                  +
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- >                     User Data (12 octets)                     <
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |                          MsgDelivery                          |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |                          SegmentSize                          |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- >                     segment data, if any                      <
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |                           Checksum                            |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
-
- Figure 3-1: Request Packet Format
-
- The fields of the Request packet are set according to the semantics
- described in Section 3.2 with the following qualifications.
-
- InterPacketGap The estimated interpacket gap time the client would like
- for the Response packet group to be sent by the Server
- in responding to this Request.
-
- Transaction Identifier for transaction, at least one greater than
- the previously issued Request from this Client.
-
- Server Server to which this Request is destined.
-
- RequestCode Request code for this request, indicating the operation
- to perform.
-
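As an illustration of the layout in Figure 3-1, the sketch below packs and unpacks the third and fourth 32-bit words of a Request header. It assumes that the leftmost bit in the figure is the most significant bit of each word, which is consistent with the text placing the function code in the LSB of the fourth word. The function names and the dictionary keys are hypothetical.

```python
CONTROL_FLAGS = ("NRS", "APG", "NSR", "NER", "NRT", "MDG", "CMG", "STI", "DRT")


def pack_word3(version, domain, packet_flags, length):
    """Version (3 bits), Domain (13), HCO/EPG/MPG (3), Length (13)."""
    return ((version & 0x7) << 29 | (domain & 0x1FFF) << 16
            | (packet_flags & 0x7) << 13 | (length & 0x1FFF))


def unpack_word3(word):
    return {
        "version": word >> 29 & 0x7,
        "domain": word >> 16 & 0x1FFF,
        "packet_flags": word >> 13 & 0x7,
        "length": word & 0x1FFF,
    }


def pack_word4(flags, retransmit, forward, gap, priority, function):
    """Control flags (9 bits), RetransmitCount (3), ForwardCount (4),
    Interpacket Gap (8), Priority (4), reserved (3), function code (1)."""
    flag_bits = 0
    for position, name in enumerate(CONTROL_FLAGS):
        if name in flags:
            flag_bits |= 1 << (8 - position)  # NRS is the leftmost flag
    return (flag_bits << 23 | (retransmit & 0x7) << 20 | (forward & 0xF) << 16
            | (gap & 0xFF) << 8 | (priority & 0xF) << 4 | (function & 0x1))


def unpack_word4(word):
    flag_bits = word >> 23
    return {
        "flags": {name for position, name in enumerate(CONTROL_FLAGS)
                  if flag_bits >> (8 - position) & 1},
        "retransmit": word >> 20 & 0x7,
        "forward": word >> 16 & 0xF,
        "gap": word >> 8 & 0xFF,
        "priority": word >> 4 & 0xF,
        "function": word & 0x1,
    }


def is_request(word4):
    """A Request has 0 in the LSB of the fourth 32-bit word."""
    return word4 & 0x1 == 0
```

The three reserved bits between Priority and the function code are always written as zero in this sketch.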
-
- 3.4. Response Packet
-
- The Response packet is sent from the Server to the Client in response to
- a Request, identified by a 1 in the LSB of the fourth 32-bit word in the
- packet.
-
-  0                   1                   2                   3
-  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- +                       Client (8 octets)                       +
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |Ver  |                         |H|E|M|                         |
- |sion |          Domain         |C|P|P|         Length          |
- |     |                         |O|G|G|                         |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |N|A|N|N|N|R|C|S|R|Retra|Forward|               |       |R|R|R| |
- |R|P|S|E|R|E|M|T|E|nsmit| Count |    PGcount    | Prior |E|E|E|1|
- |S|G|R|R|T|S|G|I|S|Count|       |               | -ity  |S|S|S| |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |                          Transaction                          |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |                        PacketDelivery                         |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- +                       Server (8 octets)                       +
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |C|D|M|S|R|R|R|R|                                               |
- |M|G|D|D|E|E|E|E|                 ResponseCode                  |
- |D|M|M|A|S|S|S|S|                                               |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- >                     UserData (20 octets)                      <
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |                          MsgDelivery                          |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |                         Segment Size                          |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- >                     segment data, if any                      <
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |                           Checksum                            |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
-
- Figure 3-2: Response Packet Format
-
- The fields of the Response packet are set according to the semantics
- described in Section 3.2 with the following qualifications.
-
- Client, Version, Domain, Transaction
- Match those in the Request packet group to which this is
- a response.
-
- STI 1 if this Response is using one or more of the
- transaction identifiers skipped by the Client after the
- Request to which this is a Response. STI in the Request
- essentially allocates up to 256 transaction identifiers
- for the Server to use in a run of Response packet
- groups.
-
- RetransmitCount The retransmit count from the last Request packet
- received to which this is a response.
-
- ForwardCount The number of times the corresponding Request was
- forwarded before this Response was generated.
-
- PGcount The number of consecutively previous packet groups that
- this response is acknowledging in addition to the one
- identified by the Transaction identifier.
-
- Server Server sending this response. This may differ from that
- originally specified in the Request packet if the
- original Server was a server group, or the request was
- forwarded.
-
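Because a Response echoes the RetransmitCount of the last Request packet received, a client can pair a Response with one particular transmission when it estimates roundtrip time. The sketch below shows one way this might be done. The class `TransmitLog` and its methods are hypothetical, and the times are plain numbers supplied by the caller.

```python
class TransmitLog:
    """Remembers when each (re)transmission of a packet group was sent."""

    def __init__(self):
        self._sent_at = {}   # RetransmitCount value -> time of that transmission
        self._count = 0      # number of transmissions made so far

    def record_transmission(self, now):
        """Note a transmission and return its RetransmitCount (modulo 8)."""
        tag = self._count % 8
        self._sent_at[tag] = now   # after wraparound, the newest entry wins
        self._count += 1
        return tag

    def roundtrip(self, echoed_count, now):
        """Roundtrip time for the transmission the responder saw, if known."""
        sent = self._sent_at.get(echoed_count % 8)
        return None if sent is None else now - sent
```

The first transmission carries a count of 0, since the field counts the transmissions made prior to the current one.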
- The next two chapters describe the protocol operation using these
- packet formats, with the Client and the Server portions described
- separately.
-
-
- 4. Client Protocol Operation
-
- This chapter describes the operation of the client portion of VMTP in
- terms of the procedures for handling VMTP user events, packet reception
- events, management operations and timeout events. Note that the client
- portion of VMTP is separable from the server portion. It is feasible to
- have a node that only implements the client end of VMTP.
-
- To simplify the description, we define a client state record (CSR) plus
- some standard utility routines.
-
-
- 4.1. Client State Record Fields
-
- In the following protocol description, there is one client state record
- (CSR) per (client,transaction) outstanding message transaction. Here is
- a suggested set of fields.
-
- Link Link to next CSR when queued in one of the transmission,
- timeout or message queues.
-
- QueuePtr Pointer to queue head in which this CSR is contained or
- NULL if none. Queue could be one of transmission queue,
- timeout queue, server queue or response queue.
-
- ProcessIdentification
- The process identification and address space.
-
- Priority Priority for processing, network service, etc.
-
- State One of the client states described below.
-
- FinishupFunc Procedure to be executed on the CSR when it completes
- its processing in transmission or timeout queues.
-
- TimeoutCount Time to remain in timeout queue.
-
- TimeoutLimit User-specified time after which the message transaction
- is aborted. The timeout is infinite if set to zero.
-
- RetransCount Number of retransmissions since last hearing from the
- Server.
-
- LastTransmitTime
- The time at which the last packet was sent. This field
- is used to calculate roundtrip times, using the
- RetransmitCount to match the responding packet to a
-
- particular transmission. That is, a Response or a management
- NotifyVmtpClient operation is matched to a Request, and a management
- NotifyVmtpServer operation is matched to a Response.
-
- TimetoLive Time to live to be used on transmission of IP packets.
-
- TransmissionMask
- Bit mask indicating the portions of the segment to
- transmit. Set before entering the transmission queue
- and cleared incrementally as the 512-byte segment blocks
- of the segment are transmitted.
-
- LocalClientLink Link to next CSR hashing to same hash index in the
- ClientMap.
-
- LocalClient Entity identifier for client when this CSR is used to
- send a Request packet.
-
- LocalTransaction
- Transaction identifier for current message transaction
- the local client has outstanding.
-
- LocalPrincipal Account identification, possibly including key and key
- timeout.
-
- LocalDelivery Bit mask of segment blocks that have not been
- acknowledged in the Request or have been received in the
- Response, depending on the state.
-
- ResponseQueue Queue of CSR's representing the queued Responses for
- this entity.
-
- VMTP Header Prototype VMTP header, used to generate and store the
- header portion of a Request for transmission and
- retransmission on timeout.
-
- SegmentDesc Description of the segment data associated with the CSR,
- either the area storing the original Request data, the
- area for receiving Request data, or the area storing the
- Response data that is returned.
-
- HostAddr The network or internetwork host address to which the
- Client last transmitted. This field also indicates the
- type of the address, e.g. IP, Ethernet, etc.
-
- Note: the CSR can be combined with a light-weight process descriptor
- with considerable benefit if the process is designed to block when it
- issues a message transaction. In particular, by combining the two
- descriptors, the implementation saves time because it only needs to
- locate and queue one descriptor with various operations (rather than
- having to locate two descriptors). It also saves space, given that the
- VMTP header prototype provides space such as the user data field which
- may serve to store processor state for when the process is preempted.
- Non-preemptive blocking can use the process stack to store the processor
- state so only a program counter and stack pointer may be required in the
- process descriptor beyond what we have described. (This is the approach
- used in the V kernel.)
-
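For illustration only, a subset of the suggested CSR fields could be declared as a Python dataclass. The field names follow the list above. The types and defaults, and the `mark_block_sent` helper, are assumptions made for this sketch rather than part of the specification.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class ClientStateRecord:
    local_client: int                 # LocalClient: 64-bit entity identifier
    local_transaction: int = 0        # LocalTransaction
    state: str = "Other"              # one of the client states
    priority: int = 0
    timeout_count: float = 0.0        # time left in the timeout queue
    timeout_limit: float = 0.0        # zero means an infinite timeout
    retrans_count: int = 0            # retransmissions since last hearing
    last_transmit_time: float = 0.0
    time_to_live: int = 0
    transmission_mask: int = 0        # segment blocks still to transmit
    local_delivery: int = 0
    host_addr: Optional[tuple] = None
    finishup_func: Optional[Callable] = None
    response_queue: deque = field(default_factory=deque)

    def mark_block_sent(self, block: int) -> None:
        """Clear one 512-byte block from the TransmissionMask once sent."""
        self.transmission_mask &= ~(1 << block)
```

Each record gets its own response queue because the default is built by a factory rather than shared between instances.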
-
- 4.2. Client Protocol States
-
- A Client State Record records the state of a message transaction generated
- by this host, identified by the (Client, Transaction) values in the CSR.
- As a client originating a transaction, it is in one of the following
- states.
-
- AwaitingResponse
- Waiting for a Response packet group to arrive with the
- same (Client,Transaction) identification.
-
- ReceivingResponse
- Waiting for additional packets in the Response packet
- group it is currently receiving.
-
- "Other" Not waiting for a response, which can be Processing or
- some other operating system state, or one of the Server
- states if it also acts as a server.
-
- This covers all the states for a client.
-
-
- 4.3. State Transition Diagrams
-
- The client state transitions are illustrated in Figure 4-1. The client
- goes into the AwaitingResponse state on sending a request unless it is a
- datagram request. In the AwaitingResponse state, it can time out and
- retry, and eventually give up and return to the processing state, unless
- it receives a Response. (A NotifyVmtpClient operation resets the
- timeout but does not change the state.) On receipt of a single-packet
- response, it returns to the processing state. Otherwise, it goes to the
- ReceivingResponse state. After a timeout, or after the final response
- packet is received, the client returns to the processing state. The
- processing state also includes any other state besides those associated
- with issuing a message transaction.
-
-    +------------+
-    | Processing |<--------------------|
-    |            |<-------------|      |
-    |            |<---|         |      |
-    +|------^--^-+  Single     Last    |
-  Transmit  |  |    Packet  Response   |
-     |      |  |   Response  Packet    |
-     |      |  |      |         |      |
-     +-DGM->+ Timeout |         | Final timeout
-     |         |      |         |      |
-    +V-----------+    |       +-----------+
-    |  Awaiting  |----+       | Receiving |->Response-+
-    |  Response  |->Response->| Response  |           |
-    |            |  (multi-   |           |<----------+
-    +-|--------^-+   packet)  +----------^+
-      V        |                |        |
-      +-Timeout+                +>Timeout+
-
- Figure 4-1: Client State Transitions
-
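The transitions in Figure 4-1 can also be written as a lookup table. The sketch below is one possible encoding. The event names are invented here to label the arrows in the figure and are not protocol identifiers.

```python
from enum import Enum, auto


class State(Enum):
    PROCESSING = auto()
    AWAITING_RESPONSE = auto()
    RECEIVING_RESPONSE = auto()


class Event(Enum):
    TRANSMIT = auto()                # Request sent, a Response is expected
    TRANSMIT_DATAGRAM = auto()       # Request sent with DGM set
    RETRY_TIMEOUT = auto()           # timeout followed by a retransmission
    FINAL_TIMEOUT = auto()           # timeout after which the client gives up
    SINGLE_PACKET_RESPONSE = auto()
    MULTI_PACKET_RESPONSE = auto()   # first packet of a multi-packet Response
    RESPONSE_PACKET = auto()         # further packet, group not yet complete
    LAST_RESPONSE_PACKET = auto()


TRANSITIONS = {
    (State.PROCESSING, Event.TRANSMIT): State.AWAITING_RESPONSE,
    (State.PROCESSING, Event.TRANSMIT_DATAGRAM): State.PROCESSING,
    (State.AWAITING_RESPONSE, Event.RETRY_TIMEOUT): State.AWAITING_RESPONSE,
    (State.AWAITING_RESPONSE, Event.FINAL_TIMEOUT): State.PROCESSING,
    (State.AWAITING_RESPONSE, Event.SINGLE_PACKET_RESPONSE): State.PROCESSING,
    (State.AWAITING_RESPONSE, Event.MULTI_PACKET_RESPONSE): State.RECEIVING_RESPONSE,
    (State.RECEIVING_RESPONSE, Event.RESPONSE_PACKET): State.RECEIVING_RESPONSE,
    (State.RECEIVING_RESPONSE, Event.RETRY_TIMEOUT): State.RECEIVING_RESPONSE,
    (State.RECEIVING_RESPONSE, Event.LAST_RESPONSE_PACKET): State.PROCESSING,
    (State.RECEIVING_RESPONSE, Event.FINAL_TIMEOUT): State.PROCESSING,
}


def next_state(state: State, event: Event) -> State:
    """Return the state that follows, or raise if the figure has no such arc."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"no transition from {state.name} on {event.name}")
```

A NotifyVmtpClient operation is deliberately absent from the table because it resets the timeout without changing the state.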
-
- 4.4. User Interface
-
- The RPC or user interface to VMTP is implementation-dependent and may
- use system calls, functions or some other mechanism. The list of
- requests that follows is intended to suggest the basic functionality that
- should be available.
-
- Send( mcb, timeout, segptr, segsize )
- Initiate a message transaction to the server and request
- message specified by mcb and return a response in mcb,
- if it is received within the specified timeout period
- (or else return USER_TIMEOUT in the Code field). The
- segptr parameter specifies the location from which the
- segment data is sent and the location into which the
- response data is to be delivered. The segsize field
- indicates the maximum length of this area.
-
- GetResponse( responsemcb, timeout, segptr, segsize )
- Get the next response sent to this client as part of the
- current message transaction, returning the segment data,
- if any, into the memory specified by segptr and segsize.
-
- This interface assumes that there is a client entity associated with the
- invoking process that is to be used with these operations. Otherwise,
- the client entity must be specified as an additional parameter.
-
-
- 4.5. Event Processing
-
- The following events may occur in the VMTP client:
-
- - User Requests
-
- * Send
-
- * GetResponse
-
- - Packet Arrival
-
- * Response Packet
-
- * Request
-
- The minimal Client implementation handles Request packets for
- its VMTP management (server) module and sends NotifyVmtpClient
- requests in response to others, indicating the specified
- server does not exist.
-
- - Management Operation - NotifyVmtpClient
-
- - Timeouts
-
- * Client Retransmission Timeout
-
- The handling of these events is described in detail in the following
- subsections.
-
- We first describe some conventions and procedures used in the
- description. A field of the received packet is indicated as (for
- example) p.Transaction, for the Transaction field. Optional portions of
- the code, such as the streaming handling code are prefixed with a "|" in
- the first column.
-
- MapClient( client )
- Return pointer to CSR for client with the specified
- clientId, else NULL.
-
- SendPacketGroup( csr )
- Send the packet group (Request, Response) according to
- that specified by the CSR.
-
- NotifyClient( csr, p, code )
- Invoke the NotifyVmtpClient operation with the
- parameters csr.RemoteClient, p.control,
- csr.ReceiveSeqNumber, csr.RemoteTransaction and
- csr.RemoteDelivery, and code. If csr is NULL, use
- p.Client, p.Transaction and p.PacketDelivery instead and
- the global ReceiveSequenceNumber, if supported. This
- function simplifies the description over calling
- NotifyVmtpClient directly in the procedural
- specification below. (See Appendix III.)
-
- NotifyServer( csr, p, code )
- Invoke the NotifyVmtpServer operation with the
- parameters p.Server, csr.LocalClient,
- csr.LocalTransaction, csr.LocalDelivery and code. Use
- p.Client, p.Transaction and 0 for the clientId, transact
- and delivery parameters if csr is NULL. This function
- simplifies the description over calling NotifyVmtpServer
- directly in the procedural specification below. (See
- Appendix III.)
-
- DGMset(p) True if DGM bit set in packet (or csr) else False.
- (Similar functions are used for other bits.)
-
- Timeout( csr, timeperiod, func )
- Set or reset timer on csr record for timeperiod later
- and invoke func if the timeout expires.
-
-
- 4.6. Client User-invoked Events
-
- A user event occurs when a VMTP user application invokes one of the VMTP
- interface procedures.
-
-
- 4.6.1. Send
-
- Send( mcb, timeout, segptr, segsize )
-     map to main CSR for this client.
-     increment csr.LocalTransaction
-     Init csr and check parameters and segment if any.
-     Set SDA if sending appended data.
-     Flush queued replies from previous transaction, if any.
-     if local non-group server then
-         deliver locally
-         await response
-         return
-     if GroupId(server) then
-         Check for and deliver to local members.
-         if CRE request and non-group local CR entity then
-             await response
-             return
-         endif
-         set MDG if member of this group.
-     endif
-     clear csr.RetransCount
-     set csr.TransmissionMask
-     set csr.TimeoutLimit to timeout
-     set csr.HostAddr for csr.Server
-     SendPacketGroup( csr )
-     if DGMset(csr) then
-         return
-     endif
-     set csr.State to AwaitingResponse
-     Timeout( rootcsr, TC1(csr.Server), LocalClientTimeout )
-     return
- end Send
-
- Notes:
-
- 1. Normally, the HostAddr is extracted from the ServerHost
- cache, which maps server entity identifiers to host
- addresses. However, on cache miss, the client first queries
- the network using the ProbeEntity operation, as specified in
- Appendix III, determining the host address from the Response.
- The ProbeEntity operation is handled as a separate message
- transaction by the Client.
-
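Note 1 describes a cache with a probe on miss. A minimal sketch of that pattern follows. The `probe` callable is a hypothetical stand-in for a ProbeEntity message transaction, and the class and method names are invented for this example.

```python
class ServerHostCache:
    """Maps server entity identifiers to host addresses."""

    def __init__(self, probe):
        # probe(server_id) stands in for a ProbeEntity transaction and is
        # expected to return a host address, or None if nothing answered.
        self._probe = probe
        self._hosts = {}

    def learn(self, server_id, host_addr):
        """Record an address seen in a received packet."""
        self._hosts[server_id] = host_addr

    def host_for(self, server_id):
        """Return the cached address, probing the network on a cache miss."""
        try:
            return self._hosts[server_id]
        except KeyError:
            host_addr = self._probe(server_id)
            if host_addr is None:
                raise LookupError(f"no host found for server {server_id:#x}")
            self._hosts[server_id] = host_addr
            return host_addr
```

With this arrangement only the first Send to a given server pays for the extra probe transaction.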
- The stream interface incorporates a parameter to pass a responseHandler
- procedure that is invoked when the message transaction completes.
-
- StreamSend( mcb, timeout, segptr, segsize, responseHandler )
-     map to main CSR for this client.
- |   Allocate a new csr if root in use.
- |   lastcsr := First csr for last request.
- |   if STIset(lastcsr)
- |       csr.LocalTransaction := lastcsr.LocalTransaction + 256
- |   else
- |       csr.LocalTransaction := lastcsr.LocalTransaction + 1
-     Init csr and check parameters and segment if any.
-     . . . ( rest is the same as for the normal Send)
-
- Notes:
-
- 1. Each outstanding message transaction is represented by a CSR
- queued on the root CSR for this client entity. The root CSR
- is used to handle timeouts, etc. On timeout, the last packet
- from the last packet group is retransmitted (with or without
- the segment data).
-
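The transaction identifier selection in StreamSend amounts to a small calculation, sketched below. The wraparound at 2**32 is an assumption based on the 32-bit width of the Transaction field. The function name is hypothetical.

```python
def next_transaction(last_transaction: int, last_had_sti: bool) -> int:
    """Transaction identifier for the next streamed Request."""
    # STI on the previous Request reserved 256 identifiers for the Server.
    step = 256 if last_had_sti else 1
    return (last_transaction + step) % 2**32  # assumed 32-bit wraparound
```

For example, after a Request with transaction 5 that had STI set, the next Request would use 261.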
-
- 4.6.2. GetResponse
-
- GetResponse( req, timeout, segptr, segsize )
-     csr := CurrentCSR;
-     if responses queued then return next response
-         (in req, segptr to max of segsize )
-     if timeout is zero then return KERNEL_TIMEOUT error
-     set state to AWAITING_RESPONSE
-     Timeout( csr, timeout, ReturnKernelTimeout );
- end GetResponse
-
- Notes:
-
- 1. GetResponse is only used with multicast Requests, which is
- the only case in which multiple (different) Responses should
- be received.
-
- 2. A response must remain queued until the next message
- transaction is invoked to filter out duplicates of this
- response.
-
- 3. If the response is incomplete (only relevant if a
- multi-packet response), then the client may wait for the
- response to be fully received, including issuing requests for
- retransmission (using NotifyVmtpServer operations) before
- returning the response.
-
- 4. As an optimization, a response may be stored in the CSR of
- the client. In this case, the response must be transferred
- to a separate buffer (for duplicate suppression) before
- waiting for another response. Using this optimization, a
- response buffer is not allocated in the common case of the
- client receiving only one response.
-
-
- 4.7. Packet Arrival
-
- In general, on packet reception, a packet is mapped to the client state
- record, decrypted if necessary using the key in the CSR. It then has
- its checksum verified and then is transformed to the right byte order.
- The packet is then processed fully relative to its packet function code.
- It is discarded immediately if it is addressed to a different domain
- than the domain(s) in which the receiving host participates.
-
- For each of the 2 packet types, we assume a procedure called with a
- pointer p to the VMTP packet and psize, the size of the packet in
- octets. Thus, generic packet reception is:
-
-     if not LocalDomain(p.Domain) then return;
-
-     csr := MapClient( p.Client )
-
-     if csr is NULL then
-         HandleNoCsr( p, psize )
-         return
-
-     if Secure(p) then
-         if SecureVMTP not supported then
-             { Assume a Request. }
-             if not Multicast(p) then
-                 NotifyClient(NULL, p, SECURITY_NOT_SUPPORTED )
-             return
-         endif
- |       Decrypt( csr.Key, p, psize )
-
-     if p.Checksum not null then
-         if not VerifyChecksum(p, psize) then return;
-     if OppositeByteOrder(p) then ByteSwap( p, psize )
-     if psize not equal sizeof(VmtpHeader) + 4*p.Length then
-         NotifyClient(NULL, p, VMTP_ERROR )
-         return
-     Invoke Procedure[p.FuncCode]( csr, p, psize )
-     Discard packet and return
-
- Notes:
-
- 1. The Procedure[p.FuncCode] refers to one of the 2 procedures
- corresponding to the two different packet types of VMTP,
- Requests and Responses.
-
- 2. In all the following descriptions, a packet is discarded on
- "return" unless otherwise stated.
-
- 3. The procedure HandleNoCsr is a management routine that
- allocates and initializes a CSR and processes the packet or
- else sends an error indication to the sender of the packet.
- This procedure is described in greater detail in Section
- 4.8.1.
-
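The order of checks in generic packet reception can be illustrated with a simplified dispatcher. This sketch leaves out decryption and byte swapping and treats checksum verification as a precomputed flag. It also assumes that the packet size being compared excludes the trailing checksum. The `Packet` type, the `receive` function and its callable parameters are all hypothetical.

```python
from dataclasses import dataclass

HEADER_OCTETS = 64  # octets shown before the segment data in Figure 3-1


@dataclass
class Packet:
    domain: int
    client: int
    length: int              # segment data, in 32-bit words
    func_code: int           # 0 = Request, 1 = Response
    size: int                # octets, assumed to exclude the checksum
    checksum_ok: bool = True


def receive(p, local_domains, client_map, handlers, handle_no_csr, notify_client):
    """Run the reception checks in order and dispatch on the function code."""
    if p.domain not in local_domains:
        return "discarded"                 # addressed to a foreign domain
    csr = client_map.get(p.client)
    if csr is None:
        handle_no_csr(p)
        return "no-csr"
    if not p.checksum_ok:
        return "discarded"
    if p.size != HEADER_OCTETS + 4 * p.length:
        notify_client(None, p, "VMTP_ERROR")
        return "error"
    handlers[p.func_code](csr, p)          # Request or Response procedure
    return "handled"
```

Returning a status string is only a convenience for this example. In the procedure above the packet is simply discarded on return.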
-
- 4.7.1. Response
-
- This procedure handles incoming Response packets.
-
- HandleResponse( csr, p, psize )
- if not LocalClient( csr ) then
- if Multicast then return
- | if Migrated( p.Client ) then
- | NotifyServer(csr, p, ENTITY_MIGRATED )
- | else
- NotifyServer(csr, p, ENTITY_NOT_HERE )
- return
- endif
-
- if NSRset(p) then
- if Streaming not supported then
- NotifyServer(csr, p, STREAMING_NOT_SUPPORTED )
- return STREAMED_RESPONSE
- | Find csr corresponding to p.Transaction
- | if none found then
- | NotifyServer(csr, p, BAD_TRANSACTION_ID )
- | return
- else
- if csr.LocalTransaction not equal p.Transaction then
- NotifyServer(csr, p, BAD_TRANSACTION_ID )
- return
- endif
- Locate reply buffer rb for this p.Server
- if found then
- if rb.State is not ReceivingResponse then
- { Duplicate }
- if APGset(p) or NERset(p) then
- { Send Response to stop response packets. }
- NotifyServer(csr, p, RESPONSE_DISCARDED )
- endif
- return
- endif
- { rb.State is ReceivingRequest}
- if new segment data then retain in CSR segment area.
- if packetgroup not complete then
- Timeout( rb, TC3(p.Server), LocalClientTimeout )
- return;
- endif
- goto EndPacketGroup
- endif
- { Otherwise, a new response message. }
-
-
-
- Cheriton [page 58]
-
-
-
- RFC 1045 VMTP February 1988
-
-
- if (NSRset(p) or NERset(p)) and NoStreaming then
- NotifyServer(csr, p, VMTP_ERROR )
- return
- | if NSRset(p) then
- | { Check consecutive with previous packet group }
- | Find last packet group CSR from p.Server.
- | if p.Transaction not
- | lastcsr.RemoteTransaction+1 mod 2**32 then
- | { Out of order packet group }
- | NotifyServer(csr, p, BAD_TRANSACTION_ID)
- | return
- | endif
- | if lastcsr not completed then
- | NotifyServer(lastcsr, p, RETRY )
- | endif
- | if CMG(lastcsr) then
- | Add segment data to lastcsr Response
- | Notify lastcsr with new packet group.
- | Clear lastcsr.VerifyInterval
- | else
- | if lastcsr available then
- | use it for this packet group
- | else allocate and initialize new CSR
- | Save message and segment data in new CSR area.
- | endif
- | else { First packet group }
- Allocate and init reply buffer rb for this response.
- if allocation fails then
- NotifyServer(csr, p, BUSY )
- return
- Set rb.State to ReceivingResponse
- Copy message and segment data to rb's segment area
- and set rb.PacketDelivery to that delivered.
- Save p.Server host address in ServerHost cache.
- endif
- if packetgroup not complete then
- Timeout( rb, TS1(p.Client), LocalClientTimeout )
- return;
- endif
- endPacketGroup:
- { We have received last packet in packet group. }
- if APGset(p) then NotifyServer(csr, p, OK )
- | if NERset(p) and CMGset(p) then
- | Queue waiting for continuation packet group.
- | Timeout( rb, TC2(rb.Server), LocalClientTimeout )
- | return
- | endif
-
-
- Cheriton [page 59]
-
-
-
- RFC 1045 VMTP February 1988
-
-
- { Deliver response message. }
- Deliver response to Client, or queue as appropriate.
- end HandleResponse
-
- Notes:
-
- 1. The mechanism for handling streaming is optional and can be
- replaced with the tests for use of streaming. Note that the
- server should never stream at the Client unless the Client
- has streamed at the Server or has used the STI control bit.
- Otherwise, streamed Responses are a protocol error.
-
- 2. As an optimization, a Response can be stored into the CSR for
- the Client rather than allocating a separate CSR for a
- response buffer. However, if multiple responses are handled,
- the code must be careful to perform duplicate detection on
- the Response stored there as well as those queued. In
- addition, GetResponse must create a queued version of this
- Response before allowing it to be overwritten.
-
- 3. The handling of Group Responses has been omitted for brevity.
- Basically, a Response is accepted if there has been a Request
- received locally from the same Client and same Transaction
- that has not been responded to. In this case, the Response
- is delivered to the Server or queued.
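-
- The steps "retain new segment data" and "packet group complete" in
- HandleResponse can be illustrated with a small reply buffer. This is
- a sketch under stated assumptions: the ReplyBuffer class, the 512-byte
- segment block size, and the convention that bit i of a delivery mask
- is (1 << i) are all choices made for the example.

```python
BLOCK_SIZE = 512  # assumed segment block size in bytes


class ReplyBuffer:
    """Hypothetical reply buffer that reassembles one packet group."""

    def __init__(self, segment_size):
        nblocks = -(-segment_size // BLOCK_SIZE)  # ceiling division
        self.expected = (1 << nblocks) - 1        # one bit per block
        self.received = 0
        self.blocks = {}

    def add_packet(self, delivery, data):
        """Retain blocks not seen before; return True once complete."""
        offset = 0
        for i in range(32):
            if delivery & (1 << i):
                chunk = data[offset:offset + BLOCK_SIZE]
                offset += BLOCK_SIZE
                if not self.received & (1 << i):
                    # New segment data: keep it. Duplicates are ignored.
                    self.blocks[i] = chunk
                    self.received |= 1 << i
        return self.received == self.expected

    def segment(self):
        """Return the blocks received so far, in block order."""
        return b"".join(self.blocks[i] for i in sorted(self.blocks))


rb = ReplyBuffer(segment_size=1024)
first = rb.add_packet(0b10, b"B" * 512)    # second block arrives first
second = rb.add_packet(0b01, b"A" * 512)   # group is now complete
```

- Packets may arrive in any order, and a duplicate packet leaves the
- buffer unchanged, which is the behavior the duplicate checks in the
- pseudocode are meant to guarantee.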
-
-
- 4.8. Management Operations
-
- VMTP uses management operations (invoked as remote procedure calls) to
- effectively acknowledge packet groups and request retransmissions. The
- following routine is invoked by the Client's management module on
- request from the Server.
-
-   NotifyVmtpClient( clientId,ctrl,receiveSeqNumber,transact,delivery,code)
-       Get csr for clientId
-       if none then return
-       if RemoteClient( csr ) and not NotifyVmtpRemoteClient then
-           return
- |     else (for streaming)
- |         Find csr with same LocalTransaction as transact
- |         if csr is NULL then return
-       if csr.State not AwaitingResponse then return
-       if ctrl.PGcount then ack previous packet groups.
-       select on code
-       case OK:
-           Notify ack'ed segment blocks from delivery
-           Clear csr.RetransCount;
-           Timeout( csr, TC1(csr.Server), LocalClientTimeout )
-           return
-       case RETRY:
-           Set csr.TransmissionMask to missing segment blocks,
-             as specified by delivery
-           SendPacketGroup( csr )
-           Timeout( csr, TC1(csr.Server), LocalClientTimeout )
-           return
-       case RETRY_ALL:
-           Set csr.TransmissionMask to retransmit all blocks.
-           SendPacketGroup( csr )
-           Timeout( csr, TC1(csr.Server), LocalClientTimeout )
- |         if streaming then
- |             Restart transmission of packet groups,
- |               starting from transact+1
-           return
-       case BUSY:
-           if csr.TimeLimit exceeded then
-               Set csr.Code to USER_TIMEOUT
-               return Response to application
-               return;
-           Set csr.TransmissionMask for full retransmission
-           Clear csr.RetransCount
-           Timeout( csr, TC1(csr.Server), LocalClientTimeout )
-           return
-       case ENTITY_MIGRATED:
-           Get new host address for entity
-           Set csr.TransmissionMask for full retransmission
-           Clear csr.RetransCount
-           SendPacketGroup( csr )
-           Timeout( csr, TC1(csr.Server), LocalClientTimeout )
-           return
-
-       case STREAMING_NOT_SUPPORTED:
-           Record that server does not support streaming
-           if CMG(csr) then forget this packet group
-           else resend Request as separate packet group.
-           return
-       default:
-           Set csr.Code to code
-           return Response to application
-           return;
-       endselect
-   end NotifyVmtpClient
-
- Notes:
-
- 1. The delivery parameter indicates the segment blocks received
- by the Server. That is, a 1 bit in the i-th position
- indicates that the i-th segment block in the segment data of
- the Request was received. All subsequent NotifyVmtpClient
- operations for this transaction should be set to acknowledge
- a superset of the segment blocks in this packet. In
- particular, the Client need not be prepared to retransmit the
- segment data once it has been acknowledged by a Notify
- operation.
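-
- The way the delivery parameter drives selective retransmission can be
- sketched with plain bit masks. The helper names, the 512-byte block
- size and the bit numbering (bit i is 1 << i) are assumptions made for
- this illustration.

```python
BLOCK_SIZE = 512          # assumed segment block size in bytes
MASK32 = 0xFFFFFFFF       # delivery masks are treated as 32-bit values


def all_blocks(segment_size):
    """Mask with one bit per segment block of the Request."""
    nblocks = (segment_size + BLOCK_SIZE - 1) // BLOCK_SIZE
    return (1 << nblocks) - 1


def transmission_mask(sent, delivery):
    """Blocks that were sent but that delivery does not acknowledge."""
    return sent & ~delivery & MASK32


def accumulate_ack(acked, delivery):
    """Each Notify acknowledges a superset, so the acked set only grows."""
    return acked | delivery


sent = all_blocks(1536)                  # three blocks: 0b111
retry = transmission_mask(sent, 0b101)   # only the middle block is missing
acked = accumulate_ack(0b001, 0b101)
```

- Because acknowledgments are cumulative, a client following this
- scheme can release a segment block as soon as its bit appears in
- acked.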
-
-
- 4.8.1. HandleNoCSR
-
- HandleNoCSR is called when a packet arrives for which there is no CSR
- matching the client field of the packet.
-
-   HandleNoCSR( p, psize )
-       if Secure(p) then
-           if SecureVMTP not supported then
-               { Assume a Request }
-               if not Multicast(p) then
-                   NotifyClient(NULL,p,SECURITY_NOT_SUPPORTED)
-               return
-           endif
-           HandleRequestNoCSR( p, psize )
-           return
-       endif
-
-       if p.Checksum not null then
-           if not VerifyChecksum(p, psize) then return;
-       if OppositeByteOrder(p) then ByteSwap( p, psize )
-       if psize not equal sizeof(VmtpHeader) + 4*p.Length then
-           NotifyClient(NULL, p, VMTP_ERROR )
-           return
-
-       if p.FuncCode is Response then
- |         if Migrated( p.Client ) then
- |             NotifyServer(csr, p, ENTITY_MIGRATED )
- |         else
-               NotifyServer(csr, p, NONEXISTENT_ENTITY )
-           return
-       endif
-
-       if p.FuncCode is Request then
-           HandleRequestNoCSR( p, psize )
-           return
-   end HandleNoCSR
-
- Notes:
-
- 1. The node need only check to see if the client entity has
- migrated if in fact it supports migration of entities.
-
- 2. The procedure HandleRequestNoCSR is specified in Section
- 5.8.1. In the minimal client version, it need only handle
- Probe requests and can do so directly without allocating a
- new CSR.
-
-
- 4.9. Timeouts
-
- A client with a message transaction in progress has a single timer
- corresponding to the first unacknowledged request message. (In the
- absence of streaming, this request is also the last request sent.) This
- timeout is handled as follows:
-
-   LocalClientTimeout( csr )
-       select on csr.State
-       case AwaitingResponse:
-           if csr.RetransCount > MaxRetrans(csr.Server) then
-               terminate Client's message transactions up to
-                 and including the current message transaction.
-               set return code to KERNEL_TIMEOUT
-               return
-           increment csr.RetransCount
-           Resend current packet group with APG set.
-           Timeout( csr, TC2(csr.Server), LocalClientTimeout )
-           return
-       case ReceivingResponse:
-           if DGMset(csr) or
-              csr.RetransCount > MaxRetrans(csr.Server) then
-               if MDMset(csr) then
-                   Set MCB.MsgDeliveryMask to blocks received.
-               else
-                   Set csr.Code to BAD_REPLY_SEGMENT
-               return to user Client
-           endif
-           increment csr.RetransCount
-           NotifyServer with RETRY
-           Timeout( csr, TC3(csr.Server), LocalClientTimeout )
-           return
-       endselect
-   end LocalClientTimeout
-
- Notes:
-
- 1. A Client can only request retransmission of a Response if the
- Response is not idempotent. If idempotent, it must
- retransmit the Request. The Server should generally support
- the MsgDeliveryMask for Requests that it treats as idempotent
- and that require multi-packet Responses. Otherwise, there is
- no selective retransmission for idempotent message
- transactions.
-
- 2. The current packet group is the last one transmitted. Thus,
- with streaming, there may be several packet groups
- outstanding that precede the current packet group.
-
- 3. The Request packet group should be retransmitted without the
- segment data, resulting in a single short packet in the
- retransmission. The Server must then send a
- NotifyVmtpClient with a RETRY or RETRY_ALL code to get the
- segment data transmitted as needed. This strategy minimizes
- the overhead on the network and the server(s) for
- retransmissions.
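-
- The client timeout logic can be sketched as a function that returns
- the action to take when the timer fires. The Csr class, the action
- strings and the default retry limit are hypothetical; a single retry
- limit is assumed for both states, and timer rescheduling is left out.

```python
from dataclasses import dataclass


@dataclass
class Csr:
    """Hypothetical subset of the client state record."""
    state: str
    retrans_count: int = 0
    dgm: bool = False    # idempotent (datagram) Response
    mdm: bool = False    # partial delivery reported via a delivery mask
    code: str = "OK"


def local_client_timeout(csr, max_retrans=3):
    """Return the action taken when the client's timer expires."""
    if csr.state == "AwaitingResponse":
        if csr.retrans_count > max_retrans:
            csr.code = "KERNEL_TIMEOUT"
            return "terminate"
        csr.retrans_count += 1
        return "resend_request_with_APG"
    if csr.state == "ReceivingResponse":
        if csr.dgm or csr.retrans_count > max_retrans:
            if not csr.mdm:
                csr.code = "BAD_REPLY_SEGMENT"
            # With MDM set, the blocks received so far are delivered.
            return "return_to_client"
        csr.retrans_count += 1
        return "notify_server_RETRY"
    return "ignore"


waiting = Csr(state="AwaitingResponse")
action = local_client_timeout(waiting)
```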
-
-
- 5. Server Protocol Operation
-
- This section describes the operation of the server portion of the
- protocol in terms of the procedures for handling VMTP user events,
- packet reception events and timeout events. Each server is assumed to
- implement the client procedures described in the previous chapter.
- (This is not strictly necessary but it simplifies the exposition.)
-
-
- 5.1. Remote Client State Record Fields
-
- The CSR for a server is extended with the following fields, in addition
- to the ones listed for the client version.
-
- RemoteClient Identifier for remote client that sent the Request that
- this CSR is handling.
-
- RemoteClientLink
- Link to next CSR hashing to same hash index in the
- ClientMap.
-
- RemoteTransaction
- Transaction identifier for Request from remote client.
-
- RemoteDelivery The segment blocks received so far as part of a Request
- or yet to be acknowledged as part of a Response.
-
- VerifyInterval Time interval since there was confirmation that the
- remote Client was still valid.
-
- RemotePrincipal Account identification, possibly including key and key
- timeout for secure communication.
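-
- As an illustration, the server-side fields above could be grouped
- into a record chained through RemoteClientLink in a hashed ClientMap.
- The field types, the bucket count and the modulo hash are assumptions
- made for this sketch; the specification does not prescribe them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RemoteClientCsr:
    """Server-side extension of the CSR (field types are assumed)."""
    remote_client: int
    remote_transaction: int = 0
    remote_delivery: int = 0
    verify_interval: float = 0.0
    remote_principal: Optional[str] = None
    remote_client_link: Optional["RemoteClientCsr"] = None


class ClientMap:
    """Hash table of CSRs chained through remote_client_link."""

    def __init__(self, nbuckets=64):
        self.buckets = [None] * nbuckets

    def _index(self, client_id):
        return client_id % len(self.buckets)  # hypothetical hash

    def insert(self, csr):
        i = self._index(csr.remote_client)
        csr.remote_client_link = self.buckets[i]
        self.buckets[i] = csr

    def find(self, client_id):
        csr = self.buckets[self._index(client_id)]
        while csr is not None and csr.remote_client != client_id:
            csr = csr.remote_client_link
        return csr


client_map = ClientMap()
a = RemoteClientCsr(remote_client=1)
b = RemoteClientCsr(remote_client=65)   # same bucket as client 1
client_map.insert(a)
client_map.insert(b)
```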
-
-
- 5.2. Remote Client Protocol States
-
- A CSR in the server end is in one of the following states.
-
- AwaitingRequest Waiting for a Request packet group. It may be marked as
- waiting on a specific Client, or on any Client.
-
- ReceivingRequest
- Waiting to receive additional Request packets in a
- multi-packet group Request.
-
- Responded The Response has been sent and the CSR is timing out,
- providing duplicate suppression and retransmission (if
- the Response was not idempotent).
-
- ResponseDiscarded
- Response has been acknowledged or has timed out so
- cannot be retransmitted. However, duplicates are still
- filtered and CSR can be reused for new message
- transaction.
-
- Processing Executing on behalf of the Client.
-
- Forwarded The message transaction has been forwarded to another
- Server that is to respond directly to the Client.
-
-
- 5.3. State Transition Diagrams
-
- The CSR state transitions in the server are illustrated in Figure 5-1.
- The CSR generally starts in the AwaitingRequest state. On receipt of a
- Request, the Server either has an up-to-date CSR for the Client or else
- it sends a Probe request (as a separate VMTP message transaction) to the
- VMTP management module associated with the Client. In the latter case,
- the processing of the Request is delayed until a Response to the Probe
- request is received. At that time, the CSR information is brought up to
- date and the Request is processed. If the Request is a single-packet
- request, the CSR is then set in the Processing state to handle the
- request. Otherwise (a multi-packet Request), the CSR is put into the
- ReceivingRequest state, waiting to receive subsequent Request packets
- that constitute the Request message. It exits the ReceivingRequest
- state on timeout or on receiving the last Request packet. In the former
- case, the request is delivered with an indication of the portion
- received, using the MsgDelivery field if MDM is set. After request
- processing is complete, either the Response is sent and the CSR enters
- the Responded state or the message transaction is forwarded and the CSR
- enters the Forwarded state.
-
- In the Responded state, if the Response is not marked as idempotent, the
- Response is retransmitted on receipt of a retransmission of the
- corresponding Request, on receipt of a NotifyVmtpServer operation
- requesting retransmission, or on timeout, at which time APG is set to
- request an acknowledgment from the Client. The Response is retransmitted
- up to some maximum number of times, after which the Response is
- discarded and the CSR is marked accordingly. If a Request or a
- NotifyVmtpServer operation is received expecting retransmission of the
- Response after the CSR has entered the ResponseDiscarded state, a
- NotifyVmtpClient operation is sent back (or invoked in the Client
- management module) indicating that the response was discarded unless the
- Request was multicast, in which case no action is taken. After a
-
-
-   (Retransmit Forwarded Request and NotifyVmtpClient)
-                     Request/
-                       Ack/
-                     +Timeout+
-                     V       |
-                   +-|-------^-+
-                   |           |
-            +-Time-| Forwarded |<-------------+
-            |  out +-----------+              |
-            |                                 |
-            |          (Retransmit Response)  |
-            |                      Request    |
-            V                        Ack      |
-            |                    +-Timeout-+  |
-            |                    V         |  |
-          +---------+   Ack/    +|---------^+ |
-   +-Time-|Response |<-Timeout--| Responded | |
-   |  out |Discarded|           +----^------+ |
-   |      +---------+                |        |
-   |  +------------+                 |        |
-   |  |            |->-Send Response-+        |
-   |  |            |->-forward Request--------+
-   +->| Processing |<----------------------+
-   |  |            |<----------------+     |
-   |  |            |<---|            |     |
-   |  +-|--------^-+    |           Last   |
-   | Receive     |      |         Request  |
-   |    |     Timeout Single      Packet   |
-   |    |        |    Packet         |  Timeout
-   |    |        |   Request         ^     ^
-   |    |        |      ^           +|-----|--+
-   |  +-V--------|-+    |           |Receiving|<-+Time
-   +->|  Awaiting  |->--+->Request->| Request |--+ out
-      |  Request   |    | (multi-   +---------+
-      +------|-----+    ^ packet)
-          Request       |
-             |      Response
-        Send Probe     to
-             |        Probe
-         +---V----+     |
-         |Awaiting|     ^
-         |Response|-->--+
-         |to Probe|
-         +--------+
-
- Figure 5-1: Remote Client State Transitions
-
- timeout corresponding to the time required to filter out duplicates, the
- CSR returns either to the AwaitingRequest state or to the Processing
- state. Note that "Ack" refers to acknowledgment by a Notify operation.
-
- A Request that is forwarded leaves the CSR in the Forwarded state. In
- the Forwarded state, the forwarded Request is retransmitted
- periodically, expecting NotifyRemoteClient operations back from the
- Server to which the Request was forwarded, analogous to the Client
- behavior in the AwaitingResponse state. In this state, a
- NotifyRemoteClient from this Server acknowledges the Request or asks
- that it be retransmitted or reports an error. A retransmission of the
- Request from the Client causes a NotifyVmtpClient to be returned to the
- Client if APG is set. The CSR leaves the Forwarded state after timing
- out in the absence of NotifyRemoteClient operations from the forward
- Server or on receipt of a NotifyRemoteClient operation indicating the
- forward Server has sent a Response and received an acknowledgement. It
- then enters the ResponseDiscarded state.
-
- Receipt of a new Request from the same Client aborts the current
- transaction, independent of its state, and initiates a new transaction
- unless the new Request is part of a run of message transactions. If it
- is part of a run of message transactions, the handling follows the state
- diagram except the new Request is not Processed until there has been a
- response sent to the previous transaction.
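-
- The transitions described in this section can be written down as a
- table. The sketch below is one reading of the text, not a normative
- definition: the event names are hypothetical labels for the arcs, the
- Probe wait is omitted because it is not one of the states listed in
- Section 5.2, and the ResponseDiscarded timeout is shown returning to
- AwaitingRequest only.

```python
from enum import Enum, auto


class State(Enum):
    AWAITING_REQUEST = auto()
    RECEIVING_REQUEST = auto()
    PROCESSING = auto()
    RESPONDED = auto()
    RESPONSE_DISCARDED = auto()
    FORWARDED = auto()


# Event names are hypothetical labels for the arcs described in the text.
TRANSITIONS = {
    (State.AWAITING_REQUEST, "single_packet_request"): State.PROCESSING,
    (State.AWAITING_REQUEST, "multi_packet_request"): State.RECEIVING_REQUEST,
    (State.RECEIVING_REQUEST, "last_request_packet"): State.PROCESSING,
    (State.RECEIVING_REQUEST, "timeout"): State.PROCESSING,
    (State.PROCESSING, "send_response"): State.RESPONDED,
    (State.PROCESSING, "forward_request"): State.FORWARDED,
    (State.RESPONDED, "ack"): State.RESPONSE_DISCARDED,
    (State.RESPONDED, "retransmissions_exhausted"): State.RESPONSE_DISCARDED,
    (State.FORWARDED, "timeout"): State.RESPONSE_DISCARDED,
    (State.FORWARDED, "forward_server_done"): State.RESPONSE_DISCARDED,
    (State.RESPONSE_DISCARDED, "timeout"): State.AWAITING_REQUEST,
}


def next_state(state, event):
    """Return the successor state; other events leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)


def run(events, state=State.AWAITING_REQUEST):
    """Apply a sequence of events and return the final state."""
    for event in events:
        state = next_state(state, event)
    return state


final = run(["multi_packet_request", "last_request_packet",
             "send_response", "ack"])
```

- Events that only cause a retransmission, such as a duplicate Request
- arriving in the Responded state, are modelled as leaving the state
- unchanged.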
-
-
- 5.4. User Interface
-
- The RPC or user interface to VMTP is implementation-dependent and may
- use system calls, functions or some other mechanism. The list of
- requests that follows is intended to suggest the basic functionality that
- should be available.
-
- AcceptMessage( reqmcb, segptr, segsize, client, transid, timeout )
- Accept a new Request message in the specified reqmcb
- area, placing the segment data, if any, in the area
- described by segptr and segsize. This returns the
- Server in the entityId field of the reqmcb and actual
- segment size in the segsize parameter. It also returns
- the Client and Transaction for this message transaction
- in the corresponding parameters. This procedure
- supports message semantics for request processing. When
- a server process executes this call, it blocks until a
- Request message has been queued for the server.
- AcceptMessage returns after the specified timeout period
- if a message has not been received by that time.
-
- RespondMessage( responsemcb, client, transid, segptr )
- Respond to the client with the specified response
- message and segment, again with message semantics.
-
- RespondCall( responsemcb, segptr )
- Respond to the client with the specified response
- message and segment, with remote procedure call
- semantics. This procedure does not return. The
- lightweight process that executes this procedure is
- matched to a stack, program counter, segment area and
- priority from the information provided in a
- ModifyService call, as specified in Appendix III.
-
- ForwardMessage( requestmcb, transid, segptr, segsize, forwardserver )
- Forward the client to the specified forwardserver with
- the request specified in requestmcb.
-
- ForwardCall( requestmcb, segptr, segsize, forwardserver )
- Forward the client transaction to the specified
- forwardserver with the request specified by requestmcb.
- This procedure does not return.
-
- GetRemoteClientId()
- Return the entityId for the remote client on whose
- behalf the process is executing. This is only
- applicable in the procedure call model of request
- handling.
-
- GetForwarder( client )
- Return the entity that forwarded this Request, if any.
-
- GetProcess( client )
- Return an identifier for the process associated with
- this client entity-id.
-
- GetPrincipal( client )
- Return the principal associated with this client
- entity-id.
-
-
- 5.5. Event Processing
-
- The following events may occur in VMTP servers.
-
- - User Requests
-
- * Receive
-
- * Respond
-
- * Forward
-
- * GetForwarder
-
- * GetProcess
-
- * GetPrincipal
-
- - Packet Arrival
-
- * Request Packet
-
- - Management Operations
-
- * NotifyVmtpServer
-
- - Timeouts
-
- * Client State Record Timeout
-
- The handling of these events is described in detail in the following
- subsections. The conventions of the previous chapter are followed,
- including the use of the various subroutines in the description.
-
-
- 5.6. Server User-invoked Events
-
- A user event occurs when a VMTP server invokes one of the VMTP interface
- procedures.
-
-
- 5.6.1. Receive
-
-   AcceptMessage(reqmcb, segptr, segsize, client, transid, timeout)
-       Locate server's request queue.
-       if request is queued then
-           Remember CSR associated with this Request.
-           return Request in reqmcb, segptr and segsize
-             and client and transaction id.
-       Wait on server's request queue for next request
-         up to timeout seconds.
-   end AcceptMessage
-
- Notes:
-
- 1. If a multi-packet Request is partially received at the time
- of the AcceptMessage, the process waits until it completes.
-
- 2. The behavior of a process accepting a Request as a
- lightweight thread is similar except that the process
- executes using the Request data logically as part of the
- requesting Client process.
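-
- The blocking behavior of AcceptMessage can be imitated with an
- in-process queue. This sketch only illustrates the "return a queued
- Request or wait up to timeout seconds" semantics; the ServerQueue
- class and its method names are hypothetical and no VMTP packets are
- involved.

```python
import queue


class ServerQueue:
    """Hypothetical stand-in for a server's request queue."""

    def __init__(self):
        self._requests = queue.Queue()

    def deliver(self, request, client, transid):
        """Called by the protocol module when a Request is complete."""
        self._requests.put((request, client, transid))

    def accept_message(self, timeout):
        """Return (request, client, transid), or None after timeout seconds."""
        try:
            return self._requests.get(timeout=timeout)
        except queue.Empty:
            return None


server_queue = ServerQueue()
server_queue.deliver(b"read block 7", client=42, transid=1)
first = server_queue.accept_message(timeout=0.01)
second = server_queue.accept_message(timeout=0.01)   # nothing is queued
```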
-
-
- 5.6.2. Respond
-
- RespondCall is described as one case of the Respond transmission
- procedure; RespondMessage is similar.
-
-   RespondCall( responsemcb, responsesegptr )
-       Locate csr for this client.
-       Check segment data accessible, if any
-       if local client then
-           Handle locally
-           return
-       endif
-       if responsemcb.Code is RESPONSE_DISCARDED then
-           Mark as RESPONSE_DISCARDED
-           return
-       SendPacketGroup( csr )
-       set csr.State to Responded.
-       if DGM reply then { Idempotent }
-           release segment data
-           Timeout( csr, TS4(csr.Client), FreeCsr );
-       else { Await acknowledgement or new Request else ask for ack. }
-           Timeout( csr, TS5(csr.Client), RemoteClientTimeout )
-   end RespondCall
-
- Notes:
-
- 1. RespondMessage is similar except the Server process must be
- synchronized with the release of the segment data (if any).
-
- 2. The non-idempotent Response with segment data is sent first
- without a request for an acknowledgement. The Response is
- retransmitted after time TS5(client) if no acknowledgment or
- new Request is received from the client in the meantime. At
- this point, the APG bit is set.
-
- 3. The MCB of the Response is buffered in the client CSR, which
- remains for TS4 seconds, sufficient to filter old duplicates.
- The segment data (if any) must be retained intact until: (1)
- the Response has been transmitted, if idempotent, or (2) it has been
- acknowledged or a timeout has occurred, if not idempotent. Techniques such as
- copy-on-write might be used to keep a copy of the Response
- segment data without incurring the cost of a copy.
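-
- The choice made at the end of RespondCall can be summarized as a
- small function. The function name and the returned tuple are
- hypothetical; TS4 and TS5 are passed in as plain numbers because
- their values depend on the implementation.

```python
def after_respond(dgm, ts4, ts5):
    """Return (keep_segment_data, timeout_seconds, timeout_handler).

    dgm is True when the Response is idempotent (a datagram reply).
    """
    if dgm:
        # Idempotent: the segment data can be released at once, and the
        # CSR is kept only long enough to filter old duplicates.
        return (False, ts4, "FreeCsr")
    # Not idempotent: keep the data until acknowledged or timed out.
    return (True, ts5, "RemoteClientTimeout")


idempotent = after_respond(dgm=True, ts4=30.0, ts5=2.0)
reliable = after_respond(dgm=False, ts4=30.0, ts5=2.0)
```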
-
-
- 5.6.3. Forward
-
- Forwarding is logically initiating a new message transaction between the
- Server (now acting as a Client) and the server to which the Request is
- forwarded. When the second server returns a Response, the same Response
- is immediately returned to the Client. The forwarding support in VMTP
- preserves these semantics while providing some performance optimizations
- in some cases.
-
-   ForwardCall( req, segptr, segsize, forwardserver )
-       Locate csr for this client.
-       Check segment data accessible, if any
-
-       if local client or Request was multicast or secure
-          or csr.ForwardCount == 15 then
-           Handle as a new Send operation
-           return
-       if forwardserver is local then
-           Handle locally
-           return
-       Set csr.funccode to Request
-       Increment csr.ForwardCount
-       Set csr.State to Responded
-       SendPacketGroup( csr ) { To ForwardServer }
-       Timeout( csr, TS4(csr.Client), FreeAlien )
-   end ForwardCall
-
- Notes:
-
- 1. A Forward is logically a new call or message transaction. It
- must actually be implemented as a new message transaction if
- the original Request was multicast or secure (with the
- optional further refinement that it can be used with a secure
- message transaction when the Server and ForwardServer are the
- same principal and the Request was not multicast).
-
- 2. A Forward operation is never handled as an idempotent
- operation because it requires knowledge that the
- ForwardServer will treat the forwarded operation as
- idempotent as well. Thus, a Forward operation that includes
- a segment should set APG on the first transmission of the
- forwarded Request to get an acknowledgement for this data.
- Once the acknowledgement is received, the forwarding Server
- can discard the segment data, leaving only the basic CSR to
- handle retransmissions from the Client.
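-
- The decision at the top of ForwardCall can be sketched as follows.
- The function and the returned labels are hypothetical; the limit of
- 15 is the ForwardCount value tested in the pseudocode.

```python
MAX_FORWARD_COUNT = 15  # ForwardCount value tested in ForwardCall


def forward_action(local_client, multicast, secure, forward_count,
                   forward_server_local):
    """Classify how a Forward request is carried out."""
    if (local_client or multicast or secure
            or forward_count == MAX_FORWARD_COUNT):
        # Must be performed as a genuinely new message transaction.
        return "new_send"
    if forward_server_local:
        return "handle_locally"
    return "forward_packet_group"


plain = forward_action(False, False, False, 0, False)
secured = forward_action(False, False, True, 0, False)
```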
-
-
- 5.6.4. Other Functions
-
- GetRemoteClientId is a simple local query of the CSR. GetProcess and
- GetPrincipal also extract this information from the CSR. A server
- module may defer the Probe callback to the Client to get that
- information until it is requested by the Server (assuming it is not
- using secure communication and duplicate suppression is adequate without
- callback.) GetForwarder is implemented as a callback to the Client,
- using a GetRequestForwarder VMTP management operation. Additional
- management procedures for VMTP are described in Appendix III.
-
-
- 5.7. Request Packet Arrival
-
- The basic packet reception follows that described for the Client
- routines. A Request packet is handled by the procedure HandleRequest.
-
-   HandleRequest( csr, p, psize )
-
-       if LocalClient(csr) then
-           { Forwarded Request on local Client }
-           if csr.LocalTransaction != p.Transaction then return
-           if csr.State != AwaitingResponse then return
-           if p.ForwardCount < csr.ForwardCount then
-               Discard Request and return.
-           Find a CSR for Client as a remote Client.
-           if not found then
-               if packet group complete then
-                   handle as a local message transaction
-                   return
-               Allocate and init CSR
-               goto newTransaction
-           { Otherwise part of current transaction }
-           { Handle directly below. }
-       if csr.RemoteTransaction = p.Transaction then
-           { Matches current transaction }
-           if OldForward(p.ForwardCount,csr.ForwardCount) then
-               return
-           if p.ForwardCount > csr.ForwardCount then
-               { New forwarded transaction }
-               goto newTransaction
-           { Otherwise part of current transaction }
-           if csr.State = ReceivingRequest then
-               if new segment data then retain in CSR segment area.
-               if Request not complete then
-                   Timeout( csr, TS1(p.Client), RemoteClientTimeout )
-                   return;
-               endif
-               goto endPacketGroup
-           endif
-           if csr.State is Responded then
-               { Duplicate }
-               if csr.Code is RESPONSE_DISCARDED
-                  and Multicast(p) then
-                   return
-               endif
-               if not DGM(csr) then { Not idempotent }
-                   if SegmentData(csr) then set APG
-                   { Resend Response or Request, if Forwarded }
-                   SendPacketGroup( csr )
-                   timeout=if SegmentData(csr) then TS5(csr.Client)
-                           else TS4(csr.Client)
-                   Timeout( csr, timeout, RemoteClientTimeout )
-                   return
-               { Else idempotent - fall thru to newTransaction }
-           else { Presume it is a retransmission }
-               NotifyClient( csr, p, OK )
-               return
-       else if OldTransaction(csr.RemoteTransaction,p.Transaction) then
-           return
-       { Otherwise, a new message transaction. }
-   newTransaction:
-       Abort handling of previous transactions for this Client.
-
-       if (NSRset(p) or NERset(p)) and NoStreaming then
-           NotifyClient( csr, p, STREAMING_NOT_SUPPORTED )
-           return
- |     if NSRset(p) then { Streaming }
- |         { Check that consecutive with previous packet group }
- |         Find last packet group CSR from this client.
- |         if p.Transaction not lastcsr.RemoteTransaction+1 mod 2**32
- |            and not STIset(lastcsr) or
- |            p.Transaction not lastcsr.RemoteTransaction+256 mod 2**32
- |         then
- |             { Out of order packet group }
- |             NotifyClient(csr, p, BAD_TRANSACTION_ID )
- |             return
- |         endif
- |         if lastcsr not completed then
- |             NotifyClient( lastcsr, p, RETRY )
- |         endif
- |         if lastcsr available then use it for this packet group
- |         else allocate and initialize new CSR
- |         if CMG(lastcsr) then
- |             Add segment data to lastcsr Request
- |             Keep csr as record of this packet group.
- |             Clear lastcsr.VerifyInterval
- |         endif
- |     else { First packet group }
-           if MultipleRemoteClients(csr) then ScavengeCsrs(p.Client)
-           Set csr.RemoteTransaction, csr.Priority
-           Copy message and segment data to csr's segment area
-             and set csr.PacketDelivery to that delivered.
-           Clear csr.PacketDelivery
-           Clear csr.VerifyInterval
-           SaveNetworkAddress( csr, p )
-       endif
-       if packetgroup not complete then
-           Timeout( csr, TS3(p.Client), RemoteClientTimeout )
-           return;
-       endif
-   endPacketGroup:
-       { We have received complete packet group. }
-       if APG(p) then NotifyClient( csr, p, OK )
-       endif
- |     if NERset(p) and CMG(p) then
- |         Queue waiting for continuation packet group.
- |         Timeout( csr, TS3(csr.Client), RemoteClientTimeout )
- |         return
- |     endif
-       { Deliver request message. }
-       if GroupId(csr.Server) then
-           For each server identified by csr.Server
-               Replicate csr and associated data segment.
-               if CMDset(csr) and Server busy then
-                   Discard csr and data
-               else
-                   Deliver or invoke csr for each Server.
-               if not DGMset(csr) then queue for Response
-               else Timeout( csr, TS4(csr.Client), FreeCsr )
-           endfor
-       else
-           if CMDset(csr) and Server busy then
-               Discard csr and data
-           else
-               Deliver or invoke csr for this server.
-           if not DGMset(csr) then queue for Response
-           else Timeout( csr, TS4(csr.Client), FreeCsr )
-       endif
-   end HandleRequest
-
- Notes:
-
- 1. A Request received that specifies a Client that is a local
- entity should be a Request forwarded by a remote server to a
- local Server.
-
- 2. An alternative structure for handling a Request sent to a
- group when there are multiple local group members is to
- create a remote CSR for each group member on reception of the
- first packet and deliver a copy of each packet to each such
- remote CSR as each packet arrives.
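-
- The transaction identifier tests in HandleRequest can be illustrated
- with modular arithmetic on 32-bit values. Two assumptions are made
- here and should be checked against the rest of the specification:
- OldTransaction is modelled as a half-range comparison (the routine is
- not defined in this section), and STI is read as meaning that the
- next identifier in a run is the previous one plus 256 instead of
- plus 1.

```python
MOD = 2 ** 32


def expected_next(last_transaction, sti):
    """Identifier expected for the next packet group in a run."""
    step = 256 if sti else 1      # assumed reading of the STI bit
    return (last_transaction + step) % MOD


def classify(current, incoming):
    """Compare an incoming identifier with the CSR's current one."""
    if incoming == current:
        return "current"
    # Assumption: identifiers up to half the space behind count as old.
    if (current - incoming) % MOD < MOD // 2:
        return "old"
    return "new"


wrapped = expected_next(2 ** 32 - 1, sti=False)   # wraps around to 0
skipped = expected_next(5, sti=True)
```

- The modulo keeps the comparison correct when the identifier wraps,
- which is why the pseudocode writes the check "mod 2**32".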
-
-
- 5.8. Management Operations
-
- VMTP uses management operations (invoked as remote procedure calls) to
- effectively acknowledge packet groups and request retransmissions. The
- following routine is invoked by the Server's management module on
- request from the Client.
-
-   NotifyVmtpServer(server,clientId,transact,delivery,code)
-       Find csr with same RemoteTransaction and RemoteClient
-         as clientId and transact.
-       if not found or csr.State not Responded then return
-       if DGMset(csr) then
-           if transmission of Response in progress then
-               Abort transmission
-               if code is migrated then
-                   restart transmission with new host addr.
-           if Retry then Report protocol error
-           return
-       endif
-       select on code
-       case RETRY:
-           if csr.RetransCount > MaxRetrans(clientId) then
-               if response data segment then
-                   Discard data and mark as RESPONSE_DISCARDED
- |             if NERset(csr) and subsequent csr then
- |                 Deallocate csr and use later csr for
- |                   future duplicate suppression
- |             endif
-               return
-           endif
-           increment csr.RetransCount
-           Set csr.TransmissionMask to missing segment blocks,
-             as specified by delivery
-           SendPacketGroup( csr )
-           Timeout( csr, TS3(csr.Client), RemoteClientTimeout )
-           return
-       case BUSY:
-           if csr.TimeLimit exceeded then
-               if response data segment then
-                   Discard data and mark as RESPONSE_DISCARDED
- |                 if NERset(csr) and subsequent csr then
- |                     Deallocate csr and use later csr for
- |                       future duplicate suppression
- |                 endif
-               endif
-           endif
-           Set csr.TransmissionMask for full retransmission
-           Clear csr.RetransCount
-           Timeout( csr, TS3(csr.Client), RemoteClientTimeout )
-           return
-
-       case ENTITY_MIGRATED:
-           Get new host address for entity
-           Set csr.TransmissionMask for full retransmission
-           Clear csr.RetransCount
-           SendPacketGroup( csr )
-           Timeout( csr, TS3(csr.Client), RemoteClientTimeout )
-           return
-
-       case default:
-           Abort transmission of Response if in progress.
-           if response data segment then
-               Discard data and mark as RESPONSE_DISCARDED
-               if NERset(csr) and subsequent csr then
-                   Deallocate csr and use later csr for
-                     future duplicate suppression
-               endif
-           return
-       endselect
-   end NotifyVmtpServer
-
- Notes:
-
- 1. A NotifyVmtpServer operation requesting retransmission of
- the Response is acceptable only if the Response was not
- idempotent. When the Response is idempotent, the Client must
- be prepared to retransmit the Request to effectively request
- retransmission of the Response.
-
- 2. A NotifyVmtpServer operation may be received while the
- Response is being transmitted. If it reports an error, the
- transmission should be aborted for efficiency, as is done
- when the Response is a datagram.
-
- 3. A NotifyVmtpServer operation indicating OK or an error
- allows the Server to discard segment data and not provide for
- subsequent retransmission of the Response.
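
The RETRY case above sets csr.TransmissionMask to the segment blocks missing from the delivery parameter. The following is a minimal sketch of that mask arithmetic, not part of the specification. It assumes a 32-bit mask in which bit i stands for segment block i; the names MASK_BITS, missing_blocks and block_indices are hypothetical.

```python
MASK_BITS = 32  # assumed width of the delivery mask


def missing_blocks(sent_mask: int, delivery: int) -> int:
    """Return the blocks that were sent but not reported as delivered."""
    full = (1 << MASK_BITS) - 1
    return sent_mask & ~delivery & full


def block_indices(mask: int) -> list[int]:
    """List the block numbers whose bits are set in mask."""
    return [i for i in range(MASK_BITS) if mask & (1 << i)]
```

A full retransmission, as in the BUSY and ENTITY_MIGRATED cases, would simply reuse the original sent mask.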
-
-
- 5.8.1. HandleRequestNoCSR
-
- When a Request is received from a Client for which the node has no CSR,
- the node allocates and initializes a CSR for this Client and does a
- callback to the Client's VMTP management module to get the Principal,
- Process and other information associated with this Client. It also
- checks that the TransactionId is correct in order to filter out
- duplicates.
-
- HandleRequestNoCSR( p, psize )
- | if Secure(p) then
- | Allocate and init CSR
- | SaveSourceHostAddr( csr, p )
- | ProbeRemoteClient( csr, p, AUTH_PROBE )
- | if no response or error then
- | delete CSR
- | return
- | Decrypt( csr.Key, p, psize )
- | if p.Checksum not null then
- | if not VerifyChecksum(p, psize) then return;
- | if OppositeByteOrder(p) then ByteSwap( p, psize )
- | if psize not equal sizeof(VmtpHeader) + 4*p.Length then
- | NotifyClient(NULL, p, VMTP_ERROR )
- | return
- | HandleRequest( csr, p, psize )
- | return
- if Server does not exist then
- NotifyClient( csr, p, NONEXISTENT_ENTITY )
- return
- endif
- if security required by server then
- NotifyClient(csr, p, SECURITY_REQUIRED )
- return
- endif
- Allocate and init CSR
- SaveSourceHostAddr( csr, p );
- if server requires Authentication then
- ProbeRemoteClient( csr, p, AUTH_PROBE )
- if no response or error then
- delete CSR
- return
- endif
- endif
- { Setup immediately as a new message transaction }
- set csr.Server to p.Server
- set csr.RemoteTransaction to p.Transaction-1
-
- HandleRequest( csr, p, psize )
- end HandleRequestNoCSR
-
- Notes:
-
- 1. A Probe request is always handled as a Request not requiring
- authentication so it never generates a callback Probe to the
- Client.
-
- 2. If the Server host retains remote client CSR's for longer
- than the maximum packet lifetime and the Request
- retransmission time, and the host has been running for at
- least that long, then it is not necessary to do a Probe
- callback unless the Request is secure. A Probe callback can
- take place when the Server asks for the Process or
- PrincipalId associated with the Client.
-
-
- 5.9. Timeouts
-
- The server must implement a timeout for remote client CSRs. There is a
- timeout for each CSR in the server.
-
- RemoteClientTimeout( csr )
- select on csr.State
- case Responded:
- if RESPONSE_DISCARDED then
- mark as timed out
- Make a candidate for reuse.
- return
- if csr.RetransCount > MaxRetrans(Client) then
- discard Response
- mark CSR as RESPONSE_DISCARDED
- Timeout(csr, TS4(Client), RemoteClientTimeout)
- return
- increment csr.RetransCount
- { Retransmit Response or forwarded Request }
- Set APG to get acknowledgement.
- SendPacketGroup( csr )
- Timeout( csr, TS3(Client), RemoteClientTimeout )
- return
- case ReceivingRequest:
- if csr.RetransCount > MaxRetrans(csr.Client)
- or DGMset(csr) or NRTset(csr) then
- Modify csr.segmentSize and csr.MsgDelivery
- to indicate packets received.
- if MDMset(csr) then
- Invoke processing on Request
- return
- else
- discard Request and reuse CSR
- (Note: Need not remember Request discarded.)
- return
- increment csr.RetransCount
- NotifyClient( csr, p, RETRY )
- Timeout( csr, TS3(Client), RemoteClientTimeout )
- return
- default:
- Report error - invalid state for RemoteClientTimeout
- endselect
- end RemoteClientTimeout
-
- Notes:
-
- 1. When a CSR in the Responded state times out after discarding
- the Response, it can be made available for reuse, either by
- the same Client or a different one. The CSR should be kept
- available for reuse by the Client for as long as possible to
- avoid unnecessary callback Probes.
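
The Responded case of RemoteClientTimeout can be read as a small state step. The sketch below is one possible rendering, not the reference implementation. The Csr class and the returned action strings are hypothetical, and timer re-arming (TS3, TS4) and packet transmission are left to the caller.

```python
from dataclasses import dataclass


@dataclass
class Csr:
    retrans_count: int = 0
    response_discarded: bool = False
    reusable: bool = False


def responded_timeout(csr: Csr, max_retrans: int) -> str:
    """Return the action taken: 'reuse', 'discard' or 'retransmit'."""
    if csr.response_discarded:
        csr.reusable = True          # CSR becomes a candidate for reuse
        return "reuse"
    if csr.retrans_count > max_retrans:
        csr.response_discarded = True
        return "discard"             # caller re-arms the timer with TS4
    csr.retrans_count += 1
    return "retransmit"              # caller sets APG, resends, re-arms TS3
```

With max_retrans set to 2, successive timeouts yield three retransmissions, then a discard, then reuse.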
-
-
- 6. Concluding Remarks
-
- This document represents a description of the current state of the VMTP
- design. We are currently engaged in several experimental
- implementations to explore and refine all aspects of the protocol.
- Preliminary implementations are running in the UNIX 4.3BSD kernel and in
- the V kernel.
-
- Several issues are still being discussed and explored with this
- protocol. First, the size of the checksum field and the algorithm to
- use for its calculation are undergoing some discussion. The author
- believes that the conventional 16-bit checksum used with TCP and IP is
- too weak for future high-speed networks, arguing for at least a 32-bit
- checksum. Unfortunately, there appears to be limited theory covering
- checksum algorithms that are suitable for calculation in software.
-
- Implementation of the streaming facilities of VMTP is still in progress.
- This facility is expected to be important for wide-area, long delay
- communication.
-
-
- I. Standard VMTP Response Codes
-
- The following are the numeric values of the response codes used in VMTP.
-
- 0 OK
-
- 1 RETRY
-
- 2 RETRY_ALL
-
- 3 BUSY
-
- 4 NONEXISTENT_ENTITY
-
- 5 ENTITY_MIGRATED
-
- 6 NO_PERMISSION
-
- 7 NOT_AWAITING_MSG
-
- 8 VMTP_ERROR
-
- 9 MSGTRANS_OVERFLOW
-
- 10 BAD_TRANSACTION_ID
-
- 11 STREAMING_NOT_SUPPORTED
-
- 12 NO_RUN_RECORD
-
- 13 RETRANS_TIMEOUT
-
- 14 USER_TIMEOUT
-
- 15 RESPONSE_DISCARDED
-
- 16 SECURITY_NOT_SUPPORTED
-
- 17 BAD_REPLY_SEGMENT
-
- 18 SECURITY_REQUIRED
-
- 19 STREAMED_RESPONSE
-
- 20 TOO_MANY_RETRIES
-
- 21 NO_PRINCIPAL
-
- 22 NO_KEY
-
- 23 ENCRYPTION_NOT_SUPPORTED
-
- 24 NO_AUTHENTICATOR
-
- 25-63 Reserved for future VMTP assignment.
-
- Other values of the codes are available for use by higher level
- protocols. Separate protocol documents will specify further standard
- values.
-
- Applications are free to use values starting at 0x00800000 (hex) for
- application-specific return values.
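
For illustration, the code assignments above can be captured as an enumeration with a small classifier. This is a sketch only; the names ResponseCode, APPLICATION_BASE and classify are hypothetical, and the label used for the unassigned middle range is an assumption about how an implementation might report it.

```python
from enum import IntEnum


class ResponseCode(IntEnum):
    OK = 0
    RETRY = 1
    RETRY_ALL = 2
    BUSY = 3
    NONEXISTENT_ENTITY = 4
    ENTITY_MIGRATED = 5
    NO_PERMISSION = 6
    NOT_AWAITING_MSG = 7
    VMTP_ERROR = 8
    MSGTRANS_OVERFLOW = 9
    BAD_TRANSACTION_ID = 10
    STREAMING_NOT_SUPPORTED = 11
    NO_RUN_RECORD = 12
    RETRANS_TIMEOUT = 13
    USER_TIMEOUT = 14
    RESPONSE_DISCARDED = 15
    SECURITY_NOT_SUPPORTED = 16
    BAD_REPLY_SEGMENT = 17
    SECURITY_REQUIRED = 18
    STREAMED_RESPONSE = 19
    TOO_MANY_RETRIES = 20
    NO_PRINCIPAL = 21
    NO_KEY = 22
    ENCRYPTION_NOT_SUPPORTED = 23
    NO_AUTHENTICATOR = 24


APPLICATION_BASE = 0x00800000  # first application-specific value


def classify(code: int) -> str:
    """Name a standard code, or say which range the value falls in."""
    if 0 <= code <= 24:
        return ResponseCode(code).name
    if 25 <= code <= 63:
        return "reserved"
    if code >= APPLICATION_BASE:
        return "application"
    return "higher-level protocol"
```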
-
-
- II. VMTP RPC Presentation Protocol
-
- For complete generality, the mapping of the procedures and the
- parameters onto VMTP messages should be defined by an RPC presentation
- protocol. In the absence of an accepted standard protocol, we define an
- RPC presentation protocol for VMTP as follows.
-
- Each procedure is assigned an identifying Request Code. The Request
- Code serves effectively the same purpose as the tag field of a variant
- record, identifying the format of the Request and associated Response as
- a variant of the possible message formats.
-
- The format of the Request for a procedure is its Request Code followed
- by its parameters sequentially in the message control block until it is
- full.
-
- The remaining parameters are sent as part of the message segment data
- formatted according to the XDR protocol (RFC ??). In this case, the
- size of the segment is specified in the SegmentSize field.
-
- The Response for a procedure consists of a ResponseCode field followed
- by the return parameters sequentially in the message control block. If
- a returned parameter must be transmitted as segment data, its size is
- specified in the SegmentSize field and the parameter is stored in the
- SegmentData field.
-
- Attributes associated with procedure definitions should indicate the
- Flags to be used in the RequestCode. Request Codes are assigned as
- described below.
-
-
- II.1. Request Code Management
-
- Request codes are divided into Public Interface Codes and
- application-specific, according to whether the PIC value is set. An
- interface is a set of request codes representing one service or module
- function. A public interface is one that is to be used in multiple
- independently developed modules. In VMTP, public interface codes are
- allocated in units of 256 structured as
-
- +-------------+----------------+-------------------+
- | ControlFlags| Interface | Version/Procedure |
- +-------------+----------------+-------------------+
- 8 bits 16 bits 8 bits
-
- An interface is free to allocate the 8 bits for version and procedure as
- desired. For example, all 8 bits can be used for procedures. A module
- requiring more than 256 Version/Procedure values can be allocated
- multiple Interface values. They need not be consecutive Interface
- values.
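
The 8/16/8-bit layout above can be packed and unpacked with shifts and masks. The sketch below assumes that ControlFlags occupies the high-order byte, as the left-to-right diagram suggests; the function names are hypothetical.

```python
def pack_request_code(flags: int, interface: int, proc: int) -> int:
    """Combine ControlFlags, Interface and Version/Procedure into 32 bits."""
    if not (0 <= flags <= 0xFF and 0 <= interface <= 0xFFFF
            and 0 <= proc <= 0xFF):
        raise ValueError("field out of range")
    return (flags << 24) | (interface << 8) | proc


def unpack_request_code(code: int) -> tuple[int, int, int]:
    """Split a 32-bit request code into (flags, interface, proc)."""
    return (code >> 24) & 0xFF, (code >> 8) & 0xFFFF, code & 0xFF
```

Under this reading, a request code of 0x05000101 has control flags 0x05, Interface 1 and Version/Procedure 1.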
-
-
- III. VMTP Management Procedures
-
- Standard procedures are defined for VMTP management, including creation,
- deletion and query of entities and entity groups, probing to get
- information about entities, and updating message transaction information
- at the client or the server.
-
- The procedures are implemented by the VMTP manager that constitutes a
- portion of every complete VMTP module. Each procedure is invoked by
- sending a Request to the VMTP manager that handles the entity specified
- in the operation or the local manager. The Request is sent using the
- normal Send operation with the Server specified as the well-known entity
- group VMTP_MANAGER_GROUP, using the CoResident Entity mechanism to direct
- the request to the specific manager that should handle the Request.
- (The ProbeEntity operation is multicast to the VMTP_MANAGER_GROUP if the
- host address for the entity is not known locally and the host address is
- determined as the host address of the responder. For all other
- operations, a ProbeEntity operation is used to determine the host
- address if it is not known.) Specifying co-resident entity 0 is
- interpreted as the manager co-resident with the invoking process. The
- co-resident entity identifier may also specify a group, in which case
- the Request is sent to all managers with members in this group.
-
- The standard procedures with their RequestCode and parameters are listed
- below with their semantics. (The RequestCode range 0xVV000100 to
- 0xVV0001FF is reserved for use by the VMTP management routines, where VV
- is any choice of control flags with the PIC bit set. The flags are set
- below as required for each procedure.)
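
A receiver could recognize the reserved management range by ignoring the control-flag byte. The sketch below does only that; it does not check the PIC bit, because the position of that bit within the flags is not given in this section. The function name is hypothetical.

```python
def is_management_code(request_code: int) -> bool:
    """True if the low 24 bits fall in 0x000100..0x0001FF."""
    return 0x000100 <= (request_code & 0x00FFFFFF) <= 0x0001FF
```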
-
- 0x05000101 - ProbeEntity(CREntity, entityId, authDomain) -> (code,
- <staterec>)
- Request and return information on the specified entity
- in the specified authDomain, sending the Request to the
- VMTP management module coresident with CREntity. An
- error return is given if the requested information
- cannot be provided in the specified authDomain. The
- <staterec> returned is structured as the following
- fields.
-
- Transaction identifier
- The current or next transaction
- identifier being used by the probed
- entity.
-
- ProcessId: 64 bits
- Identifier for client process. The
- meaning of this is specified as part of
- the Domain definition.
-
- PrincipalId The identifier for the principal or
- account associated with the process
- specified by ProcessId. The meaning of
- this field is specified as part of the
- Domain definition.
-
- EffectivePrincipalId
- The identifier for the principal or
- account associated with the Client port,
- which may be different from the
- PrincipalId especially if this is a
- nested call. The meaning of this field
- is specified as part of the Domain
- definition.
-
- The code field indicates whether this is an error
- response or not. The codes and their interpretation
- are:
-
- OK
- No error. Probe was completed OK.
-
- NONEXISTENT_ENTITY
- Specified entity does not exist.
-
- ENTITY_MIGRATED
- The entity has migrated and is no longer at the host to
- which the request was sent.
-
- NO_PERMISSION
- Entity has refused to provide ProbeResponse.
-
- VMTP_ERROR
- The Request packet group was in error relative to the
- VMTP protocol specification.
-
- "default"
- Some type of error - discard ProbeResponse.
-
- 0x0D000102 - AuthProbeEntity(CREntity,entityId,authDomain,randomId) ->
- (code,ProbeAuthenticator,EncryptType,EntityAuthenticator)
-
- Request authentication of the entity specified by
- entityId from the VMTP manager coresident with CREntity
- in authDomain authentication domain, returning the
- information contained in the return parameters. The
- fields are set the same as that specified for the basic
- ProbeResponse except as noted below.
-
- ProbeAuthenticator
- 20 bytes consisting of the EntityId, the
- randomId and the probed Entity's current
- Transaction value plus a 32-bit checksum
- for these two fields (checksummed using
- the standard packet Checksum algorithm),
- all encrypted with the Key supplied in
- the Authenticator.
-
- EncryptType An identifier that identifies the
- variant of encryption method being used
- by the probed Entity for packets it
- transmits and packets it is able to
- receive. (See Appendix V.) The
- high-order 8 bits of the EncryptType
- contain the XOR of the 8 octets of the
- PrincipalId associated with private key
- used to encrypt the EntityAuthenticator.
- This value is used by the requestor or
- Client as an aid in locating the key to
- decrypt the authenticator.
-
- EntityAuthenticator
- (returned as segment data) The
- ProcessId, PrincipalId,
- EffectivePrincipal associated with the
- ProbedEntity plus the private
- encryption/decryption key and its
- lifetime limit to be used for
- communication with the Entity. The
- authenticator is encrypted with a
- private key associated with the Client
- entity such that it can be neither read
- nor forged by a party not trusted by the
- Client Entity. The format of the
- Authenticator in the message segment is
- shown in detail in Figure III-1.
-
- +-----------------------------------------------+
- | ProcessId (8 octets) |
- +-----------------------------------------------+
- | PrincipalId (8 octets) |
- +-----------------------------------------------+
- | EffectivePrincipalId (8 octets) |
- +-----------------------------------------------+
- | Key (8 octets) |
- +-----------------------------------------------+
- | KeyTimeLimit |
- +-----------------------------------------------+
- | AuthDomain |
- +-----------------------------------------------+
- | AuthChecksum |
- +-----------------------------------------------+
-
- Figure III-1: Authenticator Format
-
- Key: 64 bits Encryption key to be used for encrypting
- and decrypting packets sent to and
- received from the probed Entity. This
- is the "working" key for packet
- transmissions. VMTP only uses private
- key encryption for data transmission.
-
- KeyTimeLimit: 32 bits
- The time in seconds since Dec. 31st,
- 1969 GMT at which one should cease to
- use the Key.
-
- AuthDomain: 32 bits
- The authentication domain in which to
- interpret the principal identifiers.
- This may be different from the
- authDomain specified in the call if the
- Server cannot provide the authentication
- information in the request domain.
-
- AuthChecksum: 32 bits
- Contains the checksum (using the same
- Checksum algorithm as for packet) of
- KeyTimeLimit, Key, PrincipalId and
- EffectivePrincipalId.
-
- Notes:
-
- 1. An authentication Probe Request and Response
- are sent unencrypted in general because they are
- used prior to there being a secure channel.
- Therefore, specific fields or groups of
- fields are checksummed and encrypted to prevent
- unauthorized modification or forgery. In
- particular, the ProbeAuthenticator is
- checksummed and encrypted with the Key.
-
- 2. The ProbeAuthenticator authenticates the
- Response as responding to the Request when
- its EntityId, randomId and Transaction values
- match those in the Probe request. The
- ProbeAuthenticator is bound to the
- EntityAuthenticator by being encrypted by the
- private Key contained in that authenticator.
-
- 3. The authenticator is encrypted such that it
- can be decrypted by a private key, known to
- the Client. This authenticator is presumably
- obtained from a key distribution center that
- the Client trusts. The AuthChecksum prevents
- undetected modifications to the
- authenticator.
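
As an illustration of the authenticator layout in Figure III-1 and of the EncryptType key-locating hint, the sketch below parses an already-decrypted 44-octet authenticator and computes the XOR of the PrincipalId octets. Big-endian byte order is an assumption made here, and the names AUTH_FORMAT, parse_authenticator and principal_hint are hypothetical. Decryption and AuthChecksum verification are not shown.

```python
import struct
from functools import reduce

# Four 8-octet fields followed by three 32-bit fields (44 octets).
AUTH_FORMAT = ">8s8s8s8sIII"  # assumed big-endian


def parse_authenticator(data: bytes) -> dict:
    """Split a decrypted authenticator into its named fields."""
    (process_id, principal_id, effective_principal_id, key,
     key_time_limit, auth_domain, auth_checksum) = struct.unpack(
        AUTH_FORMAT, data)
    return {
        "process_id": process_id,
        "principal_id": principal_id,
        "effective_principal_id": effective_principal_id,
        "key": key,
        "key_time_limit": key_time_limit,
        "auth_domain": auth_domain,
        "auth_checksum": auth_checksum,
    }


def principal_hint(principal_id: bytes) -> int:
    """XOR of the 8 PrincipalId octets, as carried in EncryptType."""
    if len(principal_id) != 8:
        raise ValueError("PrincipalId must be 8 octets")
    return reduce(lambda a, b: a ^ b, principal_id, 0)
```

A requestor could compare principal_hint against the high-order 8 bits of EncryptType to narrow the set of candidate keys before attempting decryption.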
-
- 0x05000103 - ProbeEntityBlock( entityId ) -> ( code, entityId )
- Check whether the block of 256 entity identifiers
- associated with this entityId are in use. The entityId
- returned should match that being queried or else the
- return value should be ignored and the operation redone.
-
- 0x05000104 - QueryVMTPNode( entityId ) -> (code, MTU, flags, authdomain,
- domains, authdomains, domainlist)
- Query the VMTP management module for entityId to get
- various module- or node-wide parameters, including: (1)
- MTU - Maximum transmission unit or packet size handled
- by this node. (2) flags- zero or more of the following
- bit fields:
-
- 1 Handles streamed Requests.
-
- 2 Can issue streamed message transactions
- for clients.
-
- 4 Handles secure Requests.
-
- 8 Can issue secure message transactions.
-
- The authdomain indicates the primary authentication
- domain supported. The domains and authdomains
- parameters indicate the number of entity domains and
- authentication domains supported by this node, which are
- listed in the data segment parameter domainlist if
- either parameter is non-zero. (All the entity domains
- precede the authentication domains in the data segment.)
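
The QueryVMTPNode flag bits lend themselves to a flag enumeration. The sketch below is illustrative only; the member names and the helper supports_secure_exchange are hypothetical, and only the bit values come from the list above.

```python
from enum import IntFlag


class NodeFlags(IntFlag):
    HANDLES_STREAMED_REQUESTS = 1
    ISSUES_STREAMED_TRANSACTIONS = 2
    HANDLES_SECURE_REQUESTS = 4
    ISSUES_SECURE_TRANSACTIONS = 8


def supports_secure_exchange(client_flags: int, server_flags: int) -> bool:
    """True if the client can issue and the server can handle secure Requests."""
    return (bool(client_flags & NodeFlags.ISSUES_SECURE_TRANSACTIONS)
            and bool(server_flags & NodeFlags.HANDLES_SECURE_REQUESTS))
```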
-
- 0x05000105 - GetRequestForwarder( CREntity, entityId1 ) -> (code,
- entityId2, principal, authDomain)
- Return the forwarding server's entity identifier and
- principal for the forwarder of entityId1. CREntity
- should be zero to get the local VMTP management module.
-
- 0x05000106 - CreateEntity( entityId1 ) -> ( code, entityId2 )
- Create a new entity and return its entity identifier in
- entityId2. The entity is created local to the entity
- specified in entityId1 and local to the requestor if
- entityId1 is 0.
-
- 0x05000107 - DeleteEntity( entityId ) -> ( code )
- Delete the entity specified by entityId, which may be a
- group. If a group, the deletion is only on a best
- efforts basis. The client must take additional measures
- to ensure complete deletion if required.
-
- 0x0D000108 - QueryEntity( entityId ) -> ( code, descriptor )
- Return a descriptor of entityId in arg of a maximum of
- segmentSize bytes.
-
- 0x05000109 - SignalEntity( entityId, arg )->( code )
- Send the signal specified by arg to the entity specified
- by entityId. (arg is 32 bits.)
-
- 0x0500010A - CreateGroup(CREntity,entityGroupId,entityId,perms)->(code)
- Request that the VMTP manager local to CREntity create
- a new entity group, using the specified entityGroupId
- with entityId as the first member and permissions
- "perms", a 32-bit field described later. The invoker is
- registered as a manager of the new group, giving it the
- permissions to add or remove members. (Normally
- CREntity is 0, indicating the VMTP manager local to the
- requestor.)
-
- 0x0500010B - AddToGroup(CREntity, entityGroupId, entityId,
- perms)->(code)
- Request that the VMTP manager local to CREntity add the
- specified entityId to the entityGroupId with the
- specified permissions. If entityGroupId specifies a
- restricted group, the invoker must have permission to
- add members to the group, either because the invoker is
- a manager of the group or because it was added to the
- group with the required permissions. If CREntity is 0,
- then the local VMTP manager checks permissions and
- forwards the request with CREntity set to entityId and
- the entityId field set to a digital signature (see
- below) of the Request by the VMTP manager, certifying
- that the Client has the permissions required by the
- Request. (If entityGroupId specifies an unrestricted
- group, the Request can be sent directly to the handling
- VMTP manager by setting CREntity to entityId.)
-
- 0x0500010C - RemoveFromGroup(CREntity, entityGroupId, entityId)->(code)
- Request that the VMTP manager local to CREntity remove
- the specified entityId from the group specified by
- entityGroupId. Normally CREntity is 0, indicating the
- VMTP manager local to the requestor. If CREntity is 0,
- then the local VMTP manager checks permissions and
- forwards the request with CREntity set to entityId and
- the entityId field a digital signature of the Request by
- the VMTP manager, certifying that the Client has the
- permissions required by the Request.
-
- 0x0500010D - QueryGroup( entityId )->( code, record )...
- Return information on the specified entity group. The
- Response from each responding VMTP manager is (code,
- record). The format of the record is (memberCount,
- member1, member2, ...). The Responses are returned on a
- best efforts basis; there is no guarantee that responses
- from all managers with members in the specified group
- will be received.
-
- 0x0500010E - ModifyService(entityId,flags,count,pc,threadlist)->(code,
- count)
- Modify the service associated with the entity specified
- by entityId. The flags may indicate a message service
- model, in which case the call "count" parameter
- indicates the maximum number of queued messages desired;
- the return "count" parameter indicates the number of
- queued messages allowed. Alternatively, the "flags"
- parameter indicates the RPC thread service model, in
- which case "count" threads are requested, each with an
- initial program counter as specified and stack, priority
- and message receive area indicated by the threadlist.
- In particular, "threadlist" consists of "count" records
- of the form
- (priority,stack,stacksize,segment,segmentsize), each one
- assigned to one of the threads. Flags defined for the
- "flags" parameter are:
-
- 1 THREAD_SERVICE - otherwise the message
- model.
-
- 2 AUTHENTICATION_REQUIRED - Send a Probe
- request to determine principal
- associated with the Client, if not
- known.
-
- 4 SECURITY_REQUIRED - Request must be
- encrypted or else reject.
-
- 8 INCREMENTAL - treat the count value as
- an increment (or decrement) relative to
- the current value rather than an
- absolute value for the maximum number of
- queued messages or threads.
-
- In the thread model, the count must be a positive
- increment or else 0, which disables the service. Only a
- count of 0 terminates currently queued requests or
- in-progress request handling.
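
The interaction between the INCREMENTAL flag and the count parameter can be sketched as below. This is one reading of the text above, not a definitive rule: it assumes that in the thread model a count of 0 always disables the service and negative counts are rejected, and that in the message model a negative count is only meaningful with INCREMENTAL set. The names ServiceFlags and new_limit are hypothetical.

```python
from enum import IntFlag


class ServiceFlags(IntFlag):
    THREAD_SERVICE = 1
    AUTHENTICATION_REQUIRED = 2
    SECURITY_REQUIRED = 4
    INCREMENTAL = 8


def new_limit(current: int, count: int, flags: int) -> int:
    """Return the new maximum of queued messages or threads."""
    if flags & ServiceFlags.THREAD_SERVICE:
        if count < 0:
            raise ValueError("thread model: count must be positive or 0")
        if count == 0:
            return 0                      # disables the service
    if flags & ServiceFlags.INCREMENTAL:
        return max(0, current + count)    # relative to the current value
    if count < 0:
        raise ValueError("absolute count must not be negative")
    return count                          # absolute value
```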
-
- 0x4500010F -
- NotifyVmtpClient(client,ctrl,recSeq,transact,delivery,code)->()
-
- Update the state associated with the transaction
- specified by client and transact, an entity identifier
- and transaction identifier, respectively. This
- operation is normally used only by another VMTP
- management module. (Note that it is a datagram
- operation.) The other parameters are as follows:
-
- ctrl A 32-bit value corresponding to 4th
- 32-bit word of the VMTP header of a
- Response packet that would be sent in
- response to the Request that this is
- responding to. That is, the control
- flags, ForwardCount, RetransmitCount and
- Priority fields match those of the
- Request. (The NRS flag is set if the
- receiveSeqNumber field is used.) The
- PGCount subfield indicates the number of
- previous Request packet groups being
- acknowledged by this Notify operation.
- (The bit fields that are reserved in
- this word in the header are also
- reserved here and must be zero.)
-
- recSeq Sequence number of reception at the
- Server if the NRS flag is set in the
- ctrl parameter, otherwise reserved and
- zero. (This is used for sender-based
- logging of message activity for replay
- in case of failure - an optional
- facility.)
-
- delivery Indicates the segment blocks of the
- packet group that have been received at the
- Server.
-
- code indicates the action the client should
- take, as described below.
-
- The VMTP management module should take action on this
- operation according to the code, as specified below.
-
- OK Do nothing at this time, continue
- waiting for the response with a reset
- timer.
-
- RETRY Retransmit the request packet group
- immediately with at least the segment
- blocks that the Server failed to
- receive, the complement of those
- indicated by the delivery parameter.
-
- RETRY_ALL Retransmit the request packet group
- immediately with at least the segment
- blocks that the Server failed to
- receive, as indicated by the delivery
- field plus all subsequently transmitted
- packets that are part of this packet
- run. (The latter is applicable only for
- streamed message transactions.)
-
- BUSY The server was unable to accept the
- Request at this time. Retry later if
- desired to continue with the message
- transaction.
-
- NONEXISTENT_ENTITY
- Specified Server entity does not exist.
-
- ENTITY_MIGRATED The server entity has migrated and is no
- longer at the host to which the request
- was sent. The Client should attempt to
- determine the new host address of the
- Server using the VMTP management
- ProbeEntity operation (described
- earlier).
-
- NO_PERMISSION Server has not authorized reception of
- messages from this client.
-
- NOT_AWAITING_MSG
- The conditional message delivery bit was
- set for the Request packet group and the
- Server was not waiting for it so the
- Request packet group was discarded.
-
- VMTP_ERROR The Request packet group was in error
- relative to the VMTP protocol
- specification.
-
- BAD_TRANSACTION_ID
- Transaction identifier is old relative
- to the transaction identifier held for
- the Client by the Server.
-
- STREAMING_NOT_SUPPORTED
- Server does not support multiple
- outstanding message transactions from
- the same Client, i.e. streamed message
- transactions.
-
- SECURITY_NOT_SUPPORTED
- The Request was secure and this Server
- does not support security.
-
- SECURITY_REQUIRED
- The Server is refusing the Request
- because it was not encrypted.
-
- NO_RUN_RECORD Server has no record of previous packets
- in this run of packet groups. This can
- occur if the first packet group is lost
- or if the current packet group is sent
- significantly later than the last one
- and the Server has discarded its client
- state record.
-
- 0x45000110 - NotifyVmtpServer(server,client,transact,delivery,code)->()
- Update the server state associated with the transaction
- specified by client and transact, an entity identifier
- and transaction identifier, respectively. This
- operation is normally used only by another VMTP
- management module. (Note that it is a datagram
- operation.) The other parameters are as follows:
-
- delivery Indicates the segment blocks of the
- Response packet group that have been
- received at the Client.
-
- code indicates the action the Server should
- take, as listed below.
-
- The VMTP management module should take action on this
- operation according to the code, as specified below.
-
- OK Client is satisfied with Response data.
- The Server can discard the response
- data, if any.
-
- RETRY Retransmit the Response packet group
- immediately with at least the segment
- blocks that the Client failed to
- receive, as indicated by the delivery
- parameter. (The delivery parameter
- indicates those segment blocks received
- by the Client).
-
- RETRY_ALL Retransmit the Response packet group
- immediately with at least the segment
- blocks that the Client failed to
- receive, as indicated by the (complement
- of) the delivery parameter. Also,
- retransmit all Response packet groups
- sent subsequent to the specified packet
- group.
-
- NONEXISTENT_ENTITY
- Specified Client entity does not exist.
-
- ENTITY_MIGRATED The Client entity has migrated and is no
- longer at the host to which the response
- was sent.
-
- RESPONSE_DISCARDED
- The Response was discarded and is no longer
- of interest to the Client. This may
- occur if the conditional message
- delivery bit was set for the Response
- packet group and the Client was not
- waiting for it so the Response packet
- group was discarded.
-
- VMTP_ERROR The Response packet group was in error
- relative to the VMTP protocol
- specification.
-
- 0x41000111 -
- NotifyRemoteVmtpClient(client,ctrl,recSeq,transact,delivery,code)->()
-
- The same as NotifyVmtpClient except the co-resident
- addressing is not used. This operation is used to
- update client state that is remote when a Request is
- forwarded.
-
- Note the use of the CRE bit in the RequestCodes to route the request to
- the correct VMTP management module(s) to handle the request.
-
-
- III.1. Entity Group Management
-
- An entity in a group has a set of permissions associated with its
- membership, controlling whether it can add or remove others, whether it
- can remove itself, and whether others can remove it from the group. The
- permissions for entity groups are as follows:
- VMTP_GRP_MANAGER 0x00000001 { Manager of group. }
- VMTP_REM_BY_SELF 0x00000002 { Can be removed by self. }
- VMTP_REM_BY_PRIN 0x00000004 { Can be removed by same principal. }
- VMTP_REM_BY_OTHE 0x00000008 { Can be removed by any others. }
- VMTP_ADD_PRIN 0x00000010 { Can add same principal. }
- VMTP_ADD_OTHE 0x00000020 { Can add any others. }
- VMTP_REM_PRIN 0x00000040 { Can remove same principal. }
- VMTP_REM_OTHE 0x00000080 { Can remove any others. }
-
- To remove an entity from a restricted group, the invoker must have
- permission to remove that entity and the entity must have permissions
- that allow it to be removed by the invoker. With an unrestricted group,
- only the latter condition applies.
-
- With a restricted group, a member can only be added by another entity
- with the permissions to add other entities. The creator of a group is
- given full permissions on a group. An entity adding another entity to a
- group can only give the entity it adds a subset of its permissions.
- With unrestricted groups, any entity can add itself to the group. It
- can also add other entities to the group providing the entity is not
- marked as immune to such requests. (This is an implementation
- restriction that individual entities can impose.)
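
The rule that an entity can give the entity it adds only a subset of its own permissions reduces to a bitmask test. The sketch below shows that test using the permission values listed above; the names GroupPerm and may_grant are hypothetical, and the fuller removal rules (self, same principal, others) are not modelled.

```python
from enum import IntFlag


class GroupPerm(IntFlag):
    VMTP_GRP_MANAGER = 0x00000001
    VMTP_REM_BY_SELF = 0x00000002
    VMTP_REM_BY_PRIN = 0x00000004
    VMTP_REM_BY_OTHE = 0x00000008
    VMTP_ADD_PRIN = 0x00000010
    VMTP_ADD_OTHE = 0x00000020
    VMTP_REM_PRIN = 0x00000040
    VMTP_REM_OTHE = 0x00000080


def may_grant(adder_perms: int, granted: int) -> bool:
    """True if every granted permission bit is also held by the adder."""
    return (granted & ~adder_perms) == 0
```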
-
-
- III.2. VMTP Management Digital Signatures
-
- As mentioned above, the entityId field of the AddToGroup and
- RemoveFromGroup is used to transmit a digital signature indicating the
- permission for the operation has been checked by the sending kernel.
- The digital signature procedures have not yet been defined. This field
- should be set to 0 for now to indicate no signature after the CREntity
- parameter is set to the entity on which the operation is to be
- performed.
-
-
- IV. VMTP Entity Identifier Domains
-
- VMTP allows for several disjoint naming domains for its endpoints. The
- 64-bit entity identifier is only unique and meaningful within its
- domain. Each domain can define its own algorithm or mechanism for
- assignment of entity identifiers, although each domain mechanism must
- ensure uniqueness, stability of identifiers and host independence.
-
-
- IV.1. Domain 1
-
- For initial use of VMTP, we define the domain with Domain identifier 1
- as follows:
-
- +-----------+----------------+------------------------+
- | TypeFlags | Discriminator  |    Internet Address    |
- +-----------+----------------+------------------------+
-    4 bits        28 bits              32 bits
-
- The Internet address is the Internet address of the host on which this
- entity-id is originally allocated. The Discriminator is an arbitrary
- value that is unique relative to this Internet host address. In
- addition, the host must guarantee that this identifier does not get
- reused for a long period of time after it becomes invalid. ("Invalid"
- means that no VMTP module considers it bound to an entity.) One
- technique is to use the lower order bits of a 1 second clock. The clock
- need not represent real-time but must never be set back after a crash.
- In a simple implementation, using the low order bits of a clock as the
- time stamp, the generation of unique identifiers is overall limited to
- no more than 1 per second on average. The type flags were described in
- Section 3.1.
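
The Domain 1 layout above can be sketched as a pair of pack and unpack helpers. This is only an illustration under stated assumptions: the names `vmtp_d1_id`, `vmtp_d1_pack` and `vmtp_d1_unpack` are hypothetical, the Internet address is held in host byte order, and the 4-bit TypeFlags value is treated as opaque except for GRP, which the text identifies as the second high-order bit.

```c
#include <stdint.h>

/* GRP is the second high-order bit of the 4-bit TypeFlags field. */
#define VMTP_D1_GRP 0x4u

/* Hypothetical in-memory view of a Domain 1 entity identifier. */
typedef struct {
    uint8_t  type_flags;     /* 4 bits  */
    uint32_t discriminator;  /* 28 bits */
    uint32_t inet_addr;      /* 32 bits, host byte order in this sketch */
} vmtp_d1_id;

/* Pack the three fields into the 64-bit identifier (4 + 28 + 32 bits). */
uint64_t vmtp_d1_pack(vmtp_d1_id id)
{
    return ((uint64_t)(id.type_flags & 0xFu) << 60) |
           ((uint64_t)(id.discriminator & 0x0FFFFFFFu) << 32) |
           (uint64_t)id.inet_addr;
}

/* Split a 64-bit identifier back into its three fields. */
vmtp_d1_id vmtp_d1_unpack(uint64_t raw)
{
    vmtp_d1_id id;
    id.type_flags    = (uint8_t)(raw >> 60);
    id.discriminator = (uint32_t)(raw >> 32) & 0x0FFFFFFFu;
    id.inet_addr     = (uint32_t)(raw & 0xFFFFFFFFu);
    return id;
}
```

Keeping the Discriminator masked to 28 bits in both directions prevents a stray high bit from leaking into the TypeFlags field.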
-
- An entity may migrate between hosts. Thus, an implementation can
- heuristically use the embedded Internet address to locate an entity but
- should be prepared to maintain a cache of redirects for migrated
- entities, plus accept Notify operations indicating that migration has
- occurred.
-
- Entity group identifiers in Domain 1 are structured in one of two forms,
- depending on whether they are well-known or dynamically allocated
- identifiers. A well-known entity group identifier is structured as:
-
- +-----------+----------------+------------------------+
- | TypeFlags | Discriminator  |Internet Host Group Addr|
- +-----------+----------------+------------------------+
-    4 bits        28 bits              32 bits
-
- with the second high-order bit (GRP) set to 1. This form of entity
- identifier is mapped to the Internet host group address specified in the
- low-order 32 bits. The Discriminator distinguishes group identifiers
- using the same Internet host group. Well-known entity group identifiers
- should be allocated to correspond to the basic services provided by
- hosts that are members of the group, not specifically because that
- service is provided by VMTP. For example, the well-known entity group
- identifier for the domain name service should contain as its embedded
- Internet host group address the host group for Domain Name servers.
-
- A dynamically allocated entity group identifier is structured as:
-
- +-----------+----------------+------------------------+
- | TypeFlags | Discriminator  |   Internet Host Addr   |
- +-----------+----------------+------------------------+
-    4 bits        28 bits              32 bits
-
- with the second high-order bit (GRP) set to 1. The Internet address in
- the low-order 32 bits is an Internet address assigned to the host that
- dynamically allocates this entity group identifier. A dynamically
- allocated entity group identifier is mapped to Internet host group
- address 232.X.X.X where X.X.X are the low-order 24 bits of the
- Discriminator subfield of the entity group identifier.
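
The two mappings from entity group identifier to Internet host group address can be sketched as follows. The function name is hypothetical and addresses are in host byte order. One assumption is labeled in the code: the text does not say how to tell the two forms apart, so the sketch treats a class D address (high-order bits 1110) in the low-order 32 bits as the mark of the well-known form.

```c
#include <stdint.h>

/* Returns the IP host group address (host byte order) to which a
 * Domain 1 entity group identifier maps. */
uint32_t vmtp_d1_group_to_host_group(uint64_t group_id)
{
    uint32_t low32 = (uint32_t)(group_id & 0xFFFFFFFFu);
    uint32_t discriminator = (uint32_t)(group_id >> 32) & 0x0FFFFFFFu;

    /* Assumption: a class D address in the low 32 bits means this is a
     * well-known identifier, which maps to that address directly. */
    if ((low32 & 0xF0000000u) == 0xE0000000u)
        return low32;

    /* Dynamically allocated: 232.X.X.X, where X.X.X are the low-order
     * 24 bits of the Discriminator. */
    return (232u << 24) | (discriminator & 0x00FFFFFFu);
}
```

Under these assumptions a dynamically allocated group with Discriminator 565338 on host 36.8.0.77 maps to 232.8.160.90, while a well-known group carrying 224.0.1.0 maps to 224.0.1.0 itself.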
-
- We use the following notation for Domain 1 entity identifiers <10> and
- propose its use as a standard convention.
-
- <flags>-<discriminator>-<Internet address>
-
- where <flags> are [X]{BE,LE,RG,UG}[A]
-
- X = reserved
- BE = big-endian entity
- LE = little-endian entity
- RG = restricted group
- UG = unrestricted group
- A = alias
-
- and <discriminator> is a decimal integer and <Internet address> is in
- standard dotted decimal IP address notation.
-
- _______________
-
- <10> This notation was developed by Steve Deering.
-
- Examples:
-
- BE-25593-36.8.0.49 is big-endian entity #25593 created on host
- 36.8.0.49.
-
- RG-1-224.0.1.0 is the well-known restricted VMTP managers group.
-
- UG-565338-36.8.0.77 is unrestricted entity group #565338 created on host
- 36.8.0.77.
-
- LEA-7823-36.8.0.77 is a little-endian alias entity #7823 created on host
- 36.8.0.77.
-
- This notation makes it easy to communicate and understand entity
- identifiers for Domain 1.
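
A small parser and formatter for this notation might look like the sketch below. The type and function names are hypothetical, and the flags are kept as an uninterpreted string of capital letters (the sketch does not check them against the [X]{BE,LE,RG,UG}[A] grammar). The Internet address is stored in host byte order.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical textual view of a Domain 1 entity identifier. */
typedef struct {
    char     flags[8];       /* e.g. "BE", "RG", "LEA" */
    uint32_t discriminator;  /* 28-bit value, written in decimal */
    uint32_t inet_addr;      /* host byte order in this sketch */
} vmtp_d1_text;

/* Parse <flags>-<discriminator>-<Internet address>.
 * Returns 0 on success, -1 if the text does not match. */
int vmtp_d1_parse(const char *text, vmtp_d1_text *out)
{
    unsigned disc, a, b, c, d;
    char extra;
    /* The trailing %c detects unexpected characters after the address. */
    int fields = sscanf(text, "%7[A-Z]-%u-%u.%u.%u.%u%c",
                        out->flags, &disc, &a, &b, &c, &d, &extra);
    if (fields != 6 || disc > 0x0FFFFFFFu ||
        a > 255 || b > 255 || c > 255 || d > 255)
        return -1;
    out->discriminator = disc;
    out->inet_addr = (a << 24) | (b << 16) | (c << 8) | d;
    return 0;
}

/* Write the identifier back out in the same notation. */
int vmtp_d1_format(const vmtp_d1_text *id, char *buf, size_t len)
{
    return snprintf(buf, len, "%s-%u-%u.%u.%u.%u", id->flags,
                    (unsigned)id->discriminator,
                    (unsigned)(id->inet_addr >> 24),
                    (unsigned)((id->inet_addr >> 16) & 0xFFu),
                    (unsigned)((id->inet_addr >> 8) & 0xFFu),
                    (unsigned)(id->inet_addr & 0xFFu));
}
```

Parsing "BE-25593-36.8.0.49" and formatting the result reproduces the original string, which makes the pair convenient for logging and for configuration files.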
-
- The well-known entity identifiers specified to date are:
-
- VMTP_MANAGER_GROUP      RG-1-224.0.1.0
-         Managers for VMTP operations.
-
- VMTP_DEFAULT_BECLIENT   BE-1-224.0.1.0
-         Client entity identifier to use when a (big-endian) host
-         has not determined or been allocated any client entity
-         identifiers.
-
- VMTP_DEFAULT_LECLIENT   LE-1-224.0.1.0
-         Client entity identifier to use when a (little-endian)
-         host has not determined or been allocated any client
-         entity identifiers.
-
- Note that 224.0.1.0 is the host group address assigned to VMTP and to
- which all VMTP hosts belong.
-
- Other well-known entity group identifiers will be specified in
- subsequent extensions to VMTP and in higher-level protocols that use
- VMTP.
-
-
- IV.2. Domain 3
-
- Domain 3 is reserved for embedded systems that are restricted to a
- single network and are independent of IP. Entity identifiers are
- allocated using the decentralized approach described below. The mapping
- of entity group identifiers is specific to the type of network being
- used and not defined here. In general, there should be a simple
- algorithmic mapping from entity group identifier to multicast address,
- similar to that described for Domain 1. Similarly, the values of the
- default client identifiers are specific to the type of network and not
- defined here.
-
-
- IV.3. Other Domains
-
- Definition of additional VMTP domains is planned for the future.
- Requests for allocation of VMTP Domains should be addressed to the
- Internet protocol administrator.
-
-
- IV.4. Decentralized Entity Identifier Allocation
-
- The ProbeEntityBlock operation may be used to determine whether a block
- of entity identifiers is in use. ("In use" means valid or reserved by a
- host for allocation.) This mechanism is used to detect collisions in
- allocation of blocks of entity identifiers as part of the implementation
- of decentralized allocation of entity identifiers. (Decentralized
- allocation is used for local-domain use of VMTP, such as in embedded
- systems; see Domain 3.)
-
- Basically, a group of hosts can form a Domain or sub-Domain, that is, a
- group of hosts managing its own entity identifier space or subspace,
- respectively. As an example of a sub-Domain, a group of hosts in Domain
- 1 all identified with a particular host group address can manage the
- sub-Domain corresponding to all entity identifiers that contain that
- host group address. The ProbeEntityBlock operation is used to allocate
- the random bits of these identifiers as follows.
-
- When a host requires a new block of entity identifiers, it selects a new
- block (randomly or by some choice algorithm) and then multicasts a
- ProbeEntityBlock request to the members of the (sub-)Domain some R
- times. If no response is received after R (re)transmissions, the host
- concludes that it is free to use this block of identifiers. Otherwise,
- it picks another block and tries again.
-
- Notes:
-
- 1. A block of 256 identifiers is specified by an entity
- identifier with the low-order 8 bits all zero.
-
- 2. When a host allocates an initial block of entity identifiers
- (and therefore does not yet have a specified entity
- identifier to use) it uses VMTP_DEFAULT_BECLIENT (if
- big-endian, else VMTP_DEFAULT_LECLIENT if little-endian) as
- its client identifier in the ProbeEntityBlock Request and a
- transaction identifier of 0. As soon as it has allocated a
- block of entity identifiers, it should use these identifiers
- for all subsequent communication. The default client
- identifier values are defined for each Domain.
-
- 3. The set of hosts using this decentralized allocation must not
- be subject to network partitioning. That is, the R
- transmissions must be sufficient to ensure that every host
- sees the ProbeEntityBlock request and (reliably) sends a
- response. (A host that detects a collision can retransmit
- the response multiple times until it sees a new
- ProbeEntityBlock operation from the same host/Client up to a
- maximum number of times.) For instance, a set of machines
- connected by a single local network may be able to use this type
- of allocation.
-
- 4. To guarantee T-stability, a host must prevent reuse of a
- block of identifiers if any of the identifiers in the block
- are currently valid or have been valid less than T seconds
- previously. To this end, a host must remember recently used
- identifiers and object to their reuse in response to a
- ProbeEntityBlock operation.
-
- 5. Care is required in a VMTP implementation to ensure that
- Probe operations are not discarded for lack of buffer space,
- and are not queued or delayed so long that a response is not
- generated quickly. This is required not only to detect
- collisions but also to provide accurate roundtrip estimates
- as part of ProbeEntity operations.
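
The allocation procedure and Note 1 can be sketched as a probe loop. Everything network-related here is a hypothetical stand-in: `pick_block_fn` proposes a candidate, `probe_block_fn` represents one multicast ProbeEntityBlock transmission and reports whether any host objected, and the `toy_*` functions simulate a network in memory purely for illustration. The `max_candidates` bound is an addition of this sketch; the text itself simply says to pick another block and try again.

```c
#include <stdbool.h>
#include <stdint.h>

/* Note 1: a block of 256 identifiers is named by an identifier whose
 * low-order 8 bits are all zero. */
#define VMTP_BLOCK_BASE(id) ((uint64_t)(id) & ~(uint64_t)0xFF)

/* Hypothetical hooks supplied by the caller. */
typedef uint64_t (*pick_block_fn)(void *ctx);               /* next candidate  */
typedef bool (*probe_block_fn)(uint64_t block, void *ctx);  /* true = objected */

/* Probe each candidate block up to r times; claim the first block that
 * draws no response. Returns false if every candidate was in use. */
bool vmtp_allocate_block(pick_block_fn pick, probe_block_fn probe, void *ctx,
                         int r, int max_candidates, uint64_t *block_out)
{
    for (int c = 0; c < max_candidates; c++) {
        uint64_t block = VMTP_BLOCK_BASE(pick(ctx));
        bool objected = false;
        for (int i = 0; i < r && !objected; i++)
            objected = probe(block, ctx);
        if (!objected) {
            *block_out = block;
            return true;
        }
    }
    return false;
}

/* Toy in-memory stand-in for the network, for illustration only. */
typedef struct {
    uint64_t next_candidate;  /* handed out by toy_pick, then advanced */
    uint64_t block_in_use;    /* the one block some other host holds   */
    int      probes_sent;     /* counts simulated transmissions        */
} toy_net;

uint64_t toy_pick(void *ctx)
{
    toy_net *net = ctx;
    uint64_t candidate = net->next_candidate;
    net->next_candidate += 0x100;
    return candidate;
}

bool toy_probe(uint64_t block, void *ctx)
{
    toy_net *net = ctx;
    net->probes_sent++;
    return block == net->block_in_use;
}
```

In a real implementation each probe would wait out a response timeout before the next retransmission, and the host would use the default client identifier of Note 2 until its first block is allocated.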
-
-
- V. Authentication Domains
-
- A VMTP authentication domain defines the format and interpretation for
- principal identifiers and encryption keys. In particular, an
- authentication domain must specify a means by which principal
- identifiers are allocated and guaranteed unique and stable. The
- currently defined authentication domains are as follows (0 is reserved).
-
- Ideally, all entities within one entity domain are also associated with
- one authentication domain. However, authentication domains are
- orthogonal to entity domains. Entities within one domain may have
- different authentication domains. (In this case, it is generally
- necessary to have some correspondence between principals in the
- different domains.) Also, one entity identifier may be associated with
- multiple authentication domains. Finally, one authentication domain may
- be used across multiple entity domains.
-
-
- V.1. Authentication Domain 1
-
- A principal identifier is structured as follows.
-
- +---------------------------+------------------------+
- |     Internet Address      | Local User Identifier  |
- +---------------------------+------------------------+
-            32 bits                   32 bits
-
- The Internet Address may specify an individual host (such as a UNIX
- machine) or may specify a host group address corresponding to a cluster
- of machines operating under a single administration. In both cases,
- there is assumed to be an administration associated with the embedded
- Internet address that guarantees the uniqueness and stability of the
- User Identifier relative to the Internet address. In particular, that
- administration is the only one authorized to allocate principal
- identifiers with that Internet address prefix, and it may allocate any
- of these identifiers.
-
- In authentication domain 1, the standard EncryptionQualifiers are:
-
- 0 Clear text - no encryption.
-
- 1 use 64-bit CBC DES for encryption and decryption.
-
-
- V.2. Other Authentication Domains
-
- Other authentication domains will be defined in the future as needed.
-
-
- VI. IP Implementation
-
- VMTP is designed to be implemented on the DoD IP Internet Datagram
- Protocol (although it may also be implemented as a local network
- protocol directly in "raw" network packets).
-
- VMTP is assigned the protocol number 81.
-
- With a 20 octet IP header and one segment block, a VMTP packet is 600
- octets. By convention, any host implementing VMTP implicitly agrees to
- accept VMTP/IP packets of at least 600 octets.
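
The 600-octet figure can be checked with a little arithmetic. Only the 20-octet IP header and the 600-octet total are stated here; the sketch assumes a 64-octet VMTP header, a 512-octet segment block and a 4-octet trailing checksum, sizes taken from the packet format described elsewhere in this specification rather than from this appendix. The function name is hypothetical.

```c
#include <stddef.h>

enum {
    IP_HEADER_OCTETS     = 20,   /* stated in the text above             */
    VMTP_HEADER_OCTETS   = 64,   /* assumption: fixed VMTP header size   */
    VMTP_SEGMENT_BLOCK   = 512,  /* assumption: size of a segment block  */
    VMTP_CHECKSUM_OCTETS = 4     /* assumption: 32-bit trailing checksum */
};

/* Size in octets of a VMTP/IP packet carrying the given number of
 * segment blocks. */
size_t vmtp_ip_packet_octets(size_t segment_blocks)
{
    return IP_HEADER_OCTETS + VMTP_HEADER_OCTETS +
           segment_blocks * VMTP_SEGMENT_BLOCK + VMTP_CHECKSUM_OCTETS;
}
```

With one segment block the sum is 20 + 64 + 512 + 4 = 600 octets, matching the minimum that every VMTP host agrees to accept.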
-
- VMTP multicast facilities are designed to work with, and have been
- implemented using, the multicast extensions to the Internet [8]
- described in RFC 966 and 988. The wide-scale use of full VMTP/IP
- depends on the availability of IP multicast in this form.
-
-
- VII. Implementation Notes
-
- The performance and reliability of a protocol in operation is highly
- dependent on the quality of its implementation, in addition to the
- "intrinsic" quality of the protocol design. One of the design goals of
- the VMTP effort was to produce an efficiently implementable protocol.
- The following notes and suggestions are based on experience with
- implementing VMTP in the V distributed system and the UNIX 4.3 BSD
- kernel. The following is described for a client and server handling
- only one domain. A multi-domain client or server would replicate these
- structures for each domain, although buffer space may be shared.
-
-
- VII.1. Mapping Data Structures
-
- The ClientMap procedure is implemented using a hash table that maps to
- the Client State Record whether this entity is local or remote, as shown
- in Figure VII-1.
-
-            +---+---+--------------------------+
- ClientMap  |   | x |                          |
-            +---+-|-+--------------------------+
-                  |   +--------------+    +--------------+
-                  +-->| LocalClient  |--->| LocalClient  |
-                      +--------------+    +--------------+
-                      | RemoteClient |    | RemoteClient |-> ...
-                      +--------------+    +--------------+
-                      |              |    |              |
-                      |              |    |              |
-                      +--------------+    +--------------+
-
- Figure VII-1: Mapping Client Identifier to CSR
-
- Local clients are linked through the LocalClientLink, and remote clients
- through the RemoteClientLink. Once a CSR with the specified Entity Id is found,
- some field or flag indicates whether it is identifying a local or remote
- Entity. Hash collisions are handled with the overflow pointers
- LocalClientLink and RemoteClientLink (not shown) in the CSR for the
- LocalClient and RemoteClient fields, respectively. Note that a CSR
- representing an RPC request has both a local and remote entity
- identifier mapping to the same CSR.
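
One way to realize this mapping is sketched below. It is an illustration, not the V kernel or BSD code: the table size, the hash function, the use of 0 for an unset identifier and the choice of two chain heads per bucket are all assumptions of the sketch. What it takes from the text is that a CSR carries both a LocalClient and a RemoteClient identifier, that each has its own overflow link, and that a lookup reports whether the match was local or remote.

```c
#include <stddef.h>
#include <stdint.h>

#define CLIENT_MAP_BUCKETS 64u   /* arbitrary size for the sketch */

typedef struct csr {
    uint64_t local_client;           /* 0 = unset (assumption) */
    uint64_t remote_client;          /* 0 = unset (assumption) */
    struct csr *local_client_link;   /* overflow chain, local entries  */
    struct csr *remote_client_link;  /* overflow chain, remote entries */
} csr;

typedef struct {
    csr *local_head[CLIENT_MAP_BUCKETS];
    csr *remote_head[CLIENT_MAP_BUCKETS];
} client_map;

static unsigned client_hash(uint64_t id)
{
    return (unsigned)((id ^ (id >> 32)) % CLIENT_MAP_BUCKETS);
}

void client_map_init(client_map *map)
{
    for (unsigned i = 0; i < CLIENT_MAP_BUCKETS; i++)
        map->local_head[i] = map->remote_head[i] = NULL;
}

/* Enter a CSR under each identifier it carries. */
void client_map_insert(client_map *map, csr *rec)
{
    if (rec->local_client != 0) {
        unsigned b = client_hash(rec->local_client);
        rec->local_client_link = map->local_head[b];
        map->local_head[b] = rec;
    }
    if (rec->remote_client != 0) {
        unsigned b = client_hash(rec->remote_client);
        rec->remote_client_link = map->remote_head[b];
        map->remote_head[b] = rec;
    }
}

/* Find the CSR for an entity identifier; *is_local tells the caller
 * whether the identifier matched as a local or a remote client. */
csr *client_map_lookup(const client_map *map, uint64_t id, int *is_local)
{
    unsigned b = client_hash(id);
    for (csr *p = map->local_head[b]; p != NULL; p = p->local_client_link)
        if (p->local_client == id) { *is_local = 1; return p; }
    for (csr *p = map->remote_head[b]; p != NULL; p = p->remote_client_link)
        if (p->remote_client == id) { *is_local = 0; return p; }
    return NULL;
}
```

A CSR representing an RPC request is reachable under both of its identifiers, so a lookup by either one returns the same record.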
-
- The Server specified in a Request is mapped to a server descriptor using
- the ServerMap (with collisions handled by the overflow pointer). The
- server descriptor is the root of a queue of CSR's for handling requests
- plus flags that modify the handling of the Request. Flags include:
-
-            +-------+---+-------------------------+
- ServerMap  |       | x |                         |
-            +-------+-|-+-------------------------+
-                      |   +--------------+
-                      |   | OverflowLink |
-                      |   +--------------+
-                      +-->| Server       |
-                          +--------------+
-                          | Flags | Lock |
-                          +--------------+
-                          | Head Pointer |
-                          +--------------+
-                          | Tail Pointer |
-                          +--------------+
-
- Figure VII-2: Mapping Server Identifiers
-
- THREAD_QUEUE
-         Request is to be invoked directly as a remote procedure
-         invocation, rather than by a server process in the
-         message model.
-
- AUTHENTICATION_REQUIRED
-         Send a Probe request to determine the principal associated
-         with the Client, if not known.
-
- SECURITY_REQUIRED
-         Request must be encrypted; otherwise it is rejected.
-
- REQUESTS_QUEUED
-         Queue contains waiting requests, rather than free CSR's.
-         Queue this request as well.
-
- SERVER_WAITING
-         The server is waiting and available to handle an incoming
-         Request immediately, as required by CMD.
-
- Alternatively, the Server identifiers can be mapped to a CSR using the
- MapToClient mechanism with a pointer in the CSR referring to the server
- descriptor, if any. This scheme is attractive if there are client CSR's
- associated with a service to allow it to communicate as a client using
- VMTP with other services.
-
- Finally, a similar structure is used to expand entity group identifiers
- to the local membership, as shown in Figure VII-3. A group identifier
- is hashed to an index in the GroupMap. The list of group descriptors
- rooted at that index in the GroupMap contains a group descriptor for
- each local member of the group. The flags are the group permissions
- defined in Appendix III.
-
-            +-------+---+----------------------------------+
- GroupMap   |       | x |                                  |
-            +-------+-|-+----------------------------------+
-                      |   +--------------+
-                      |   | OverflowLink |
-                      |   +--------------+
-                      +-->|EntityGroupId |
-                          +--------------+
-                          | Flags        |
-                          +--------------+
-                          | Member Entity|
-                          +--------------+
-
- Figure VII-3: Mapping Group Identifiers
-
- Note that the same pool of descriptors could be used for the server and
- group descriptors given that they are similar in size.
-
-
- VII.2. Client Data Structures
-
- Each client entity is represented as a client state record. The CSR
- contains a VMTP header as well as other bookkeeping fields, including
- timeout count and retransmission count, as described in Section 4.1. In
- addition, there is a timeout queue, transmission queue and reception
- queue. Finally, there is a ServerHost cache that maps from server
- entity-id records to host address, estimated round trip time,
- interpacket gap, MTU size and (optimally) estimated processing time for
- this server entity.
-
-
- VII.3. Server Data Structures
-
- The server maintains a heap of client state records (CSR), one for each
- (Client, Transaction). (If streams are not supported, there is, at
- worst, a CSR per Client with which the server has communicated
- recently.) The CSR contains a VMTP header as well as various
- bookkeeping fields including timeout count and retransmission count. The
- server maintains a hash table mapping of Client to CSR as well as the
- transmission, timeout and reception queues. In a VMTP module
- implementing both the client and server functions, the same timeout
- queue and transmission queue are used for both.
-
-
- VII.4. Packet Group Transmission
-
- The procedure SendPacketGroup( csr ) transmits the packet group
- specified by the record CSR. It performs the following steps:
-
- 1. Fragments the segment data, if any, into packets. (Note
- that the presence of segment data is flagged by the SDA bit.)
-
- 2. Modifies the VMTP header for each packet as required, e.g.
- changing the delivery mask as appropriate.
-
- 3. Computes the VMTP checksum.
-
- 4. Encrypts the appropriate portion of the packet, if required.
-
- 5. Prepends and appends the network-level header and trailer,
- using the network address from the ServerHost cache or from
- the responding CSR.
-
- 6. Transmits the packet with the interpacket gap specified in
- the cache. This may involve round-robin scheduling between
- hosts as well as delaying transmissions slightly.
-
- 7. Invokes the finish-up procedure specified by the CSR record,
- completing the processing. Generally, this finish-up
- procedure adds the record to the timeout queue with the
- appropriate timeout value.
-
- The CSR includes a 32-bit transmission mask that indicates the portions
- of the segment to transmit. The SendPacketGroup procedure is assumed to
- handle queuing at the network transmission queue, queuing in priority
- order according to the priority field specified in the CSR record.
- (This priority may be reflected in network transmission behavior for
- networks that support priority.)
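
The role of the 32-bit transmission mask in fragmentation can be sketched as follows. The sketch assumes that bit i of the mask, counting from the low-order bit, selects the i-th 512-octet block of the segment, and that the final block may be short; both the bit ordering and the names used here are assumptions of the example rather than statements from this appendix.

```c
#include <stddef.h>
#include <stdint.h>

#define VMTP_SEGMENT_BLOCK 512u  /* assumption: octets per segment block */
#define VMTP_MAX_BLOCKS    32u   /* one block per bit of the mask        */

typedef struct {
    size_t offset;  /* byte offset of the block within the segment */
    size_t length;  /* octets to send; the last block may be short */
} vmtp_block;

/* List the portions of a segment selected by the transmission mask.
 * Returns the number of entries written to out. */
size_t vmtp_select_blocks(uint32_t mask, size_t segment_size,
                          vmtp_block out[VMTP_MAX_BLOCKS])
{
    size_t n = 0;
    for (unsigned bit = 0; bit < VMTP_MAX_BLOCKS; bit++) {
        size_t offset = (size_t)bit * VMTP_SEGMENT_BLOCK;
        if (offset >= segment_size)
            break;                      /* past the end of the segment */
        if ((mask >> bit) & 1u) {
            size_t remaining = segment_size - offset;
            out[n].offset = offset;
            out[n].length = remaining < VMTP_SEGMENT_BLOCK
                                ? remaining : VMTP_SEGMENT_BLOCK;
            n++;
        }
    }
    return n;
}
```

On a selective retransmission the caller would set only the bits for the blocks reported missing, so the same routine serves both the first transmission and later retries.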
-
- The SendPacketGroup procedure only looks at the following fields of a
- CSR
-
- - Transmission mask
-
- - FuncCode
-
- - SDA
-
- - Client
-
- - Server
-
- - CoResidentEntity
-
- - Key
-
- It modifies the following fields
-
- - Length
-
- - Delivery
-
- - Checksum
-
- In the case of encrypted transmission, it encrypts the entire packet
- except for the Client field and the following 32 bits.
-
- If the packet group is a Response (i.e., the low-order bit of the function
- code is 1), the destination network address is determined from the
- Client, otherwise the Server. The HostAddr field is set either from the
- ServerHost cache (if a Request) or from the original Request if a
- Response, before SendPacketGroup is called.
-
- The CSR includes timeout and TTL fields indicating the maximum time to
- complete the processing and the time-to-live for the packets to be
- transmitted.
-
- SendPacketGroup is viewed as the right functionality to implement for
- transmission in an "intelligent" network interface.
-
- Finally, it appears preferable to be able to assume that all portions of
- the segment remain memory-resident (no page faults) during transmission.
- In a demand-paged system, some form of locking is required to keep the
- segment data in memory.
-
-
- VII.5. VMTP Management Module
-
- The implementation should implement the management operations as a
- separate module that is invoked from within the VMTP module. When a
- Request is received, either from the local user level or the network,
- for the VMTP management module, the management module is invoked as a
- remote or local procedure call to handle this request and return a
- response (if not a datagram request). By registering as a local server,
- the management module should minimize the special-case code required for
- its invocation. The management module is basically a case statement
- that selects the operation based on the RequestCode and then invokes the
- specified management operation. The procedures implementing the
- management operations, especially operations like NotifyVmtpClient and
- NotifyVmtpServer, are logically part of the VMTP module because they
- require full access to the basic data structures of the VMTP
- implementation.
-
- The management module should be implemented so that it can respond
- quickly to all requests, particularly since the timing of management
- interactions is used to estimate round trip time. To date, all
- implementations of the management module have been done at the kernel
- level, along with VMTP proper.
-
-
- VII.6. Timeout Handling
-
- The timeout queue is a queue of CSR records, ordered by timeout count,
- as specified in the CSR record. On entry into the timeout queue, the
- CSR record has the timeout field set to the time (preferably in
- milliseconds or similar unit) to remain in the queue plus the finishup
- field set to the procedure to execute on removal on timeout from the
- queue. The timeout field for a CSR in the queue is the time relative to
- the record preceding it in the queue (if any) at which it is to be
- removed. Some system-specific mechanism decrements the time for the
- record at the front of the queue, invoking the finishup procedure when
- the count goes to zero.
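
The relative-time ordering described above is often called a delta queue, and can be sketched as below. The `csr` structure here is cut down to the three fields the queue needs, and the function names are hypothetical. On entry `timeout` holds the time to remain in the queue; once queued it holds the time relative to the preceding record.

```c
#include <stddef.h>

typedef struct csr {
    unsigned timeout;                /* ms; relative to predecessor when queued */
    void (*finishup)(struct csr *);  /* run on removal by timeout; may be NULL  */
    struct csr *next;
} csr;

/* Insert rec so that the queue stays ordered by expiry time. */
void timeout_enqueue(csr **head, csr *rec)
{
    csr **pp = head;
    /* Walk past records that expire no later than rec, converting
     * rec's timeout into a time relative to each one passed. */
    while (*pp != NULL && (*pp)->timeout <= rec->timeout) {
        rec->timeout -= (*pp)->timeout;
        pp = &(*pp)->next;
    }
    /* The successor, if any, is now relative to rec. */
    if (*pp != NULL)
        (*pp)->timeout -= rec->timeout;
    rec->next = *pp;
    *pp = rec;
}

/* Account for elapsed ms; only the front record is ever decremented. */
void timeout_tick(csr **head, unsigned elapsed)
{
    while (*head != NULL) {
        csr *front = *head;
        if (front->timeout > elapsed) {
            front->timeout -= elapsed;
            return;
        }
        elapsed -= front->timeout;
        *head = front->next;        /* unlink before calling finishup */
        front->next = NULL;
        front->timeout = 0;
        if (front->finishup != NULL)
            front->finishup(front);
    }
}
```

Because a record is unlinked before its finishup procedure runs, that procedure is free to put the record back on the queue with a fresh timeout, for example after a retransmission.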
-
- Using this scheme, a special CSR is used to periodically scan for CSR's
- that have not been pinged recently. That is, this CSR times out and
- invokes a finishup procedure that scans for CSR's that are
- "AwaitingResponse" and have not been pinged recently, signals the
- request processing entity for each, and deletes that CSR. The special
- CSR then returns to the timeout queue.
-
- The timeout mechanism tends to be specific to an operating system. The
- scheme described may have to be adapted to the operating system in which
- VMTP is to be implemented.
-
- This mechanism handles client request timeout and client response
- timeout. It is not intended to handle interpacket gaps given that these
- times are expected to be under 1 millisecond in general and possibly
- only a few microseconds.
-
-
- VII.7. Timeout Values
-
- Roundtrip timeout values are estimated by matching Responses or
- NotifyVmtpClient Requests to Request transmission, relying on the
- retransmitCount to identify the particular transmission of the Request
- that generated the response. A similar technique can be used with
- Responses and NotifyVmtpServer Requests. The retransmitCount is
- incremented each time the Response is sent, whether the retransmission
- was caused by timeout or retransmission of the Request.
-
- The ProbeEntity request is recommended as a basic way of getting
- up-to-date information about a Client as well as predictable host
- machine turnaround in processing a request. (VMTP assumes and requires
- an efficient, bounded response time implementation of the ProbeEntity
- operation.)
-
- Using this mechanism for measuring RTT, it is recommended that the
- various estimation and smoothing techniques developed for TCP RTT
- estimation be adapted and used.
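
As one example of such a technique, the sketch below applies the smoothed round trip time and mean deviation estimator used for TCP retransmission timers (gains of 1/8 and 1/4, timeout equal to the smoothed RTT plus four times the deviation). This is TCP practice borrowed for illustration, not something VMTP specifies, and the type and function names are hypothetical. Samples are assumed to come from Responses matched to a specific transmission via the retransmitCount.

```c
/* Smoothed RTT state, in milliseconds. */
typedef struct {
    double srtt_ms;      /* smoothed round trip time */
    double rttvar_ms;    /* smoothed mean deviation  */
    int    have_sample;  /* 0 until the first sample */
} rtt_estimator;

/* Fold one round trip measurement into the estimate. */
void rtt_update(rtt_estimator *e, double sample_ms)
{
    if (!e->have_sample) {
        /* First sample initializes both terms. */
        e->srtt_ms = sample_ms;
        e->rttvar_ms = sample_ms / 2.0;
        e->have_sample = 1;
        return;
    }
    double err = sample_ms - e->srtt_ms;
    double abs_err = err < 0 ? -err : err;
    /* Update the deviation first, using the old smoothed RTT. */
    e->rttvar_ms += 0.25 * (abs_err - e->rttvar_ms);
    e->srtt_ms += 0.125 * err;
}

/* Retransmission timeout derived from the current estimate. */
double rtt_timeout_ms(const rtt_estimator *e)
{
    return e->srtt_ms + 4.0 * e->rttvar_ms;
}
```

A VMTP module could keep one such estimator per entry in its ServerHost cache and feed it from ProbeEntity exchanges as well as from ordinary message transactions.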
-
-
- VII.8. Packet Reception
-
- Logically, a network packet containing a VMTP packet consists of five portions:
-
- - network header, possibly including lower-level headers
-
- - VMTP header
-
- - data segment
-
- - VMTP checksum
-
- - network trailer, etc.
-
- It may be advantageous to receive a packet fragmented into these
- portions, if supported by the network module. In this case, ideally the
- VMTP header may be received directly into a CSR, the data segment into a
- page that can be mapped, rather than copied, to its final destination,
- with VMTP checksum and network header in a separate area (used to
- extract the network address corresponding to the sender).
-
- Packet reception is described in detail by the pseudo-code in Section
- 4.7.
-
- With a response, normally the CSR has an associated segment area
- immediately available so delivery of segment data is immediate.
- Similarly, server entities should be "armed" with CSR's with segment
- areas that provide for immediate delivery of requests. It is reasonable
- to discard segment data that cannot be immediately delivered in this
- way, providing that clients and servers are able to preallocate CSR's
- with segment areas for requests and responses. In particular, a client
- should be able to provide some number of additional CSR's for receiving
- multiple responses to a multicast request.
-
- The CSR data structure is intended to be the interface data structure
- for an intelligent network interface. For reception, the interface is
- "armed" with CSR's that may point to segment areas in main memory, into
- which it can deliver a packet group. Ideally, the interface handles all
- the processing of all packets, interacting with the host after receiving
- a complete Request or Response packet group. An implementation should
- use an interface based on SendPacketGroup(CSR) and
- ReceivePacketGroup(CSR) to facilitate the introduction of an intelligent
- network interface.
-
- ReceivePacketGroup(csr) provides the interface with a CSR descriptor and
- zero or more bytes of main memory to receive segment data. The CSR
- describes whether it is to receive responses (and if so, for which
- client) or requests (and if so for which server).
-
- The procedure ReclaimCSR(CSR) reclaims the specified record from the
- interface before it has been returned after receiving the specified
- packet group.
-
- A finishup procedure is set in the CSR to be invoked when the CSR is
- returned to the host by the normal processing sequence in the interface.
- Similarly, the timeout parameter is set to indicate the maximum time the
- host is providing for the routine to perform the specified function.
- The CSR and associated segment memory are returned to the host after the
- timeout period, with an indication of the progress made. They are not
- returned earlier.
-
-
- VII.9. Streaming
-
- The implementation of streaming is optional in both VMTP clients and
- servers. Ideally, all performance-critical servers should implement
- streaming. In addition, clients that have high context switch overhead,
- network access overhead or expect to be communicating over long delay
- links should also implement streaming.
-
- A client stream is implemented by allocating a CSR for each outstanding
- message transaction. A stream of transactions is handled similarly to
- multiple outstanding transactions from separate clients except for the
- interaction between consecutive numbered transactions in a stream.
-
- For the server VMTP module, streamed message transactions to a server
- are queued (if accepted) subordinate to the first unprocessed CSR
- corresponding to this Client. Thus, streamed transactions from a given
- Client are always performed in the order specified by the transaction
- identifiers.
-
- If a server does not implement streaming, it must refuse streamed
- message transactions using the NotifyVmtpClient operation. Also, all
- client VMTP's that support streaming must support the streamed interface
- to a server that does not support streaming. That is, they must perform
- the message transactions one at a time. Consequently, a program that
- uses the streaming interface to a non-streaming server experiences
- degraded performance, but not failure.
-
-
- VII.10. Implementation Experience
-
- The implementation experience to date includes a partial implementation
- (minus the streaming and full security) in the V kernel plus a similar
- preliminary implementation in the 4.3 BSD Unix kernel. In the V kernel
- implementation, the CSR's are part of the (lightweight) process
- descriptor.
-
- The V kernel implementation is able to perform a VMTP message
- transaction with no data segment between two Sun-3/75's connected by 10
- Mb Ethernet in 2.25 milliseconds. It is also able to transfer data at
- 4.7 megabits per second using 16 kilobyte Requests (but with null checksums).
- The UNIX kernel implementation running on Microvax II's achieves a basic
- message transaction time of 9 milliseconds and data rate of 1.9 megabits
- per second using 16 kilobyte Responses. This implementation is using
- the standard VMTP checksum.
-
- We hope to report more extensive implementation experience in future
- revisions of this document.
-
-
- VIII. UNIX 4.3 BSD Kernel Interface for VMTP
-
- UNIX 4.3 BSD includes a socket-based design for program interfaces to a
- variety of protocol families and types of protocols (streams,
- datagrams). In this appendix, we sketch an extension to this design to
- support a transaction-style protocol. (Some familiarity with UNIX 4.2/3
- IPC is assumed.) Several extensions are required to the system
- interface, rather than just adding a protocol, because no provision was
- made for supporting transaction protocols in the original design. These
- extensions include a new "transaction" type of socket plus new system
- calls invoke, getreply, probeentity, recvreq, sendreply and forward.
-
- A socket of type transaction bound to the VMTP protocol type
- IPPROTO_VMTP is created by the call
-
- s = socket(AF_INET, SOCK_TRANSACT, IPPROTO_VMTP);
-
- This socket is bound to an entity identifier by
-
- bind(s, &entityid, sizeof(entityid));
-
- The first address/port bound to a socket is considered its primary name
- and is the one used on packet transmission. A message transaction is
- invoked between the socket named by s and the Server specified by mcb by
-
- invoke(s, mcb, segptr, seglen, timeout );
-
- The mcb is a message control block whose format was described in Section
- 2.4. The message control block specifies the request to send plus the
- destination Server. The response message control block returned by the
- server is stored in mcb when invoke returns. The invoking process is
- blocked until a response is received or the message transaction times
- out unless the request is a datagram request. (Non-blocking versions
- with signals on completion could also be provided, especially with a
- streaming implementation.)
-
- For multicast message transactions (sent to an entity group), the next
- response to the current message transaction (if it arrives in less than
- timeout milliseconds) is returned by
-
- getreply( s, mcb, segptr, maxseglen, timeout );
-
- The invoke operation sent to an entity group completes as soon as the
- first response is received. A request is retransmitted until the first
- reply is received (assuming the request is not a datagram). Thus, the
- system does not retransmit while getreply is timing out even if no
- replies are available.
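-
- The interplay of invoke and getreply for a group request can be
- sketched in C as follows. This is an illustration only: the reduced
- struct mcb, the stub_ helpers, the collect_replies function, and the
- return convention (0 on success, -1 on timeout) are assumptions made
- for the example, since this appendix does not state return values. The
- invoke and getreply bodies are in-memory stand-ins, not the kernel
- calls, so that the sketch can be compiled and run anywhere.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced message control block; the real layout is the
 * one described in Section 2.4. */
struct mcb {
    unsigned long server;   /* destination on request; responder on reply (assumed) */
    unsigned code;
};

/* In-memory stand-ins for the system calls.
 * Assumed convention: 0 on success, -1 on timeout. */
unsigned long stub_responders[8];
int stub_head, stub_tail;

/* Queue a group member that will answer the next request. */
void stub_add_responder(unsigned long id)
{
    if (stub_tail < 8)
        stub_responders[stub_tail++] = id;
}

/* Deliver the next queued response into m, or report a timeout. */
static int stub_next(struct mcb *m)
{
    if (stub_head == stub_tail)
        return -1;                       /* nobody left: behaves like a timeout */
    m->server = stub_responders[stub_head++];
    return 0;
}

int invoke(int s, struct mcb *m, char *segptr, int seglen, int timeout)
{
    (void)s; (void)segptr; (void)seglen; (void)timeout;
    return stub_next(m);                 /* completes on the first response */
}

int getreply(int s, struct mcb *m, char *segptr, int maxseglen, int timeout)
{
    (void)s; (void)segptr; (void)maxseglen; (void)timeout;
    return stub_next(m);                 /* next response; nothing is retransmitted */
}

/* Send one request to a group and record up to max responders.
 * invoke supplies the first response; getreply supplies the rest, and
 * its timeout ends the collection. Returns the number recorded. */
int collect_replies(int s, struct mcb *m, unsigned long *from, int max, int timeout)
{
    int n = 0;

    assert(m != NULL && from != NULL);
    if (max <= 0 || invoke(s, m, NULL, 0, timeout) < 0)
        return 0;
    from[n++] = m->server;
    while (n < max && getreply(s, m, NULL, 0, timeout) == 0)
        from[n++] = m->server;
    return n;
}
```

- Because retransmission stops once the first reply arrives, a caller
- written this way should treat a getreply timeout as "no further
- responders", not as a failure of the transaction.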
-
-
- Cheriton [page 118]
-
- The state of an entity associated with entityId is probed using
-
- probeentity( entityId, state );
-
- A UNIX process acting as a VMTP server accepts a Request by the
- operation
-
- recvreq(s, mcb, segptr, maxseglen );
-
- The request message for the next queued transaction request is returned
- in mcb, plus the segment data of maximum length maxseglen, starting at
- segptr in the address space. On return, the message control block
- contains the values as set in invoke except: (1) the Client field
- indicates the Client that sent the received Request message; (2) the
- Code field indicates the type of request; and (3) the MsgDelivery field
- indicates the portions of the segment actually received within the
- specified segment size, if MDM is 1 in the Code field. A segment block
- is marked as missing (i.e. the corresponding bit in the MsgDelivery
- field is 0) unless it is received in its entirety or it is all of the
- data in the last segment block contained in the segment.
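-
- A server can test the MsgDelivery field along the following lines. This
- is a minimal sketch, not part of the kernel interface: the helper
- names, the 512-octet segment block size, the 32-bit mask width, and
- the convention that bit i (counting from the low-order bit) covers
- segment block i are assumptions made for the example and should be
- checked against the packet format definition.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed for illustration: 512-octet segment blocks, one mask bit each. */
#define SEG_BLOCK_SIZE 512u
#define MAX_SEG_BLOCKS 32u

/* Number of segment blocks needed to hold seglen octets. */
unsigned seg_block_count(uint32_t seglen)
{
    return (seglen + SEG_BLOCK_SIZE - 1u) / SEG_BLOCK_SIZE;
}

/* Nonzero if the given block is marked as received
 * (bit 0 = first block; this bit order is an assumption). */
int block_received(uint32_t msgdelivery, unsigned block)
{
    assert(block < MAX_SEG_BLOCKS);
    return (int)((msgdelivery >> block) & 1u);
}

/* Count the blocks of a seglen-octet segment that are marked missing. */
unsigned missing_blocks(uint32_t msgdelivery, uint32_t seglen)
{
    unsigned n = seg_block_count(seglen);
    unsigned missing = 0;
    unsigned i;

    assert(n <= MAX_SEG_BLOCKS);
    for (i = 0; i < n; i++)
        if (!block_received(msgdelivery, i))
            missing++;
    return missing;
}
```

- For a 1300-octet segment (three blocks under these assumptions), a
- mask with only bits 0 and 2 set reports one missing block, so the
- server knows that the second block must not be used.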
-
- To complete a transaction, the reply specified by mcb is sent to the
- client named in the Client field of the MCB using
-
- sendreply(s, mcb, segptr );
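-
- The receive-and-reply cycle of a server can be sketched in C as
- follows. This is an illustration under stated assumptions: the reduced
- struct mcb, the request and reply codes, the stub_queue_request helper,
- the serve_one function, and the return convention (0 on success, -1 if
- nothing is queued) are all hypothetical, and the recvreq and sendreply
- bodies are in-memory stand-ins for the kernel calls so that the sketch
- can be compiled and run anywhere.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced message control block; the real layout is the
 * one described in Section 2.4. */
struct mcb {
    unsigned long client;   /* sender of the request; recipient of the reply */
    unsigned code;          /* request type on receipt, reply code on send (simplified) */
};

/* Hypothetical request and reply codes, for the example only. */
enum { REPLY_OK = 0, REQ_PING = 1, REPLY_UNKNOWN = 2 };

/* In-memory stand-ins for the system calls. */
struct mcb queued_req;      /* the single request waiting to be accepted */
int have_req;
struct mcb last_reply;      /* what the last sendreply transmitted */

/* Place one request in the stub's queue. */
void stub_queue_request(unsigned long client, unsigned code)
{
    queued_req.client = client;
    queued_req.code = code;
    have_req = 1;
}

int recvreq(int s, struct mcb *m, char *segptr, int maxseglen)
{
    (void)s;
    assert(m != NULL);
    if (!have_req)
        return -1;                          /* nothing queued */
    *m = queued_req;
    if (segptr != NULL && maxseglen > 0)
        segptr[0] = '\0';                   /* this stub delivers no segment data */
    have_req = 0;
    return 0;
}

int sendreply(int s, struct mcb *m, char *segptr)
{
    (void)s; (void)segptr;
    assert(m != NULL);
    last_reply = *m;                        /* record where the reply went */
    return 0;
}

/* Accept one request and answer it. The Client field is left untouched,
 * so sendreply returns the response to the original requester. */
int serve_one(int s)
{
    struct mcb m;
    char seg[512];

    if (recvreq(s, &m, seg, (int)sizeof seg) < 0)
        return -1;
    m.code = (m.code == REQ_PING) ? REPLY_OK : REPLY_UNKNOWN;
    return sendreply(s, &m, NULL);
}
```

- Reusing the received control block for the reply is the design choice
- worth noting here: the server changes only the fields that describe
- the response, which keeps the reply addressed to the right client.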
-
- Finally, a message transaction specified by mcb is forwarded to
- newserver as though it were sent there by its original invoker using
-
- forward(s, mcb, segptr, timeout );
-
-
- Cheriton [page 119]
-
-
- Index
-
- Acknowledgment 14
- APG 16, 31, 39
- Authentication domain 20
-
- Big-endian 9
-
- Checksum 14, 43
- Checksum, not set 44
- Client 7, 10, 38
- Client timer 16
- CMD 42, 110
- CMG 32, 40
- Co-resident entity 25
- Code 42
- CoResidentEntity 42, 43
- CRE 21, 42
-
- DGM 42
- Digital signature, VMTP management 95, 101
- Diskless workstations 2
- Domain 9, 38
- Domain 1 102
- Domain 3 104
-
- Entity 7
- Entity domain 9
- Entity group 8
- Entity identifier 37
- Entity identifier allocation 105
- Entity identifier, all-zero 38
- EPG 20, 39
-
- Features 6
- ForwardCount 24
- Forwarding 24
- FunctionCode 41
-
- Group 8
- Group message transaction 10
- Group timeouts 16
- GRP 37
-
- HandleNoCSR 62
- HandleRequestNoCSR 79
- HCO 14, 23, 39
-
-
- Cheriton [page 120]
-
- Host independence 8
-
- Idempotent 15
- Interpacket gap 18, 40
- IP 108
-
- Key 91
-
- LEE 32, 37
- Little-endian 9
-
- MCB 118
- MDG 22, 40
- MDM 30, 42
- Message control block 118
- Message size 6
- Message transaction 7, 10
- MPG 39
- MsgDelivery 43
- MSGTRANS_OVERFLOW 27
- Multicast 4, 21, 120
- Multicast, reliable 21
-
- Naming 6
- Negative acknowledgment 31
- NER 25, 31, 39
- NRT 26, 30, 39
- NSR 25, 27, 31, 39
-
- Object-oriented 2
- Overrun 18
-
- Packet group 7, 29, 39
- Packet group run 31
- PacketDelivery 29, 31, 41
- PGcount 26, 41
- PIC 42
- Principal 11
- Priority 41
- Process 11
- ProcessId 89
- Protocol number, IP 108
-
- RAE 37
- Rate control 18
- Real-time 2, 4
- Realtime 22
-
-
- Cheriton [page 121]
-
- Reliability 12
- Request message 10
- RequestAckRetries 30
- RequestRetries 15
- Response message 10
- ResponseAckRetries 31
- ResponseRetries 15
- Restricted group 8
- Retransmission 15
- RetransmitCount 17
- Roundtrip time 17
- RPC 2
- Run 31, 39
- Run, message transactions 25
-
- SDA 42
- Security 4, 19
- Segment block 41
- Segment data 43
- SegmentSize 42, 43
- Selective retransmission 18
- Server 7, 10, 41
- Server group 8
- Sockets, VMTP 118
- STI 26, 40
- Streaming 25, 55
- Strictly stable 8
- Subgroups 21
-
- T-stable 8
- TC1(Server) 16
- TC2(Server) 16
- TC3(Server) 16
- TC4 16
- TCP 2
- Timeouts 15
- Transaction 10, 41
- Transaction identification 10
- TS1(Client) 17
- TS2(Client) 17
- TS3(Client) 17
- TS4(Client) 17
- TS5(Client) 17
- Type flags 8
-
- UNIX interface 118
- Unrestricted group 8, 38
-
-
- Cheriton [page 122]
-
- NotifyVmtpClient 7, 26, 27, 30
- NotifyVmtpServer 7, 14, 30
- User Data 43
-
- Version 38
- VMTP Management digital signature 95, 101
-
-
- Cheriton [page 123]
-
-